The Making of Self-Driving Cars: A Developer’s Guide

Ever since the dawn of automobiles, self-driving cars have been a hot topic for sci-fi fans, tech pioneers and sociologists. History has proven that mobility is one of the key contributors to general development and modern civilization, so it is no exaggeration that a global-scale autonomous transportation system will have an unprecedented impact on our society – changing the way we live, work and travel.

by Árpád Takács, Outreach Scientist, AImotive

Human driving is a complex task. We can recognize and understand the environment, plan and re-plan, control and adapt in a fraction of a second. We can silently communicate with the environment while we are driving, follow written and unwritten rules, and heavily rely on our creativity. On the other hand, today's automation systems strictly follow an if–then basis of operation, which is hardly deployable in self-driving cars. We cannot account for every single traffic scenario, or store the look of every car model or pedestrian for better recognition.

To bridge this gap between current technological availability and the demand from society and the market, many ideas and prototypes have been introduced over the past few years. Regardless of the technological details and deployability, there has always been a common tool: machine learning, and through that, Artificial Intelligence (AI) (Figure 1).

Figure 1. One of AImotive's prototype vehicles circling the streets of Budapest. While testing AI-based algorithms is a complex task, testing licenses are now issued all over the world to self-driving companies.

When ready for mass production, self-driving cars will be the very first demonstration of AI in safety-critical systems on a global scale. Although it might seem we are planning to trust our lives to AI completely, behind the wheel there will be a lot more than just a couple of bits and bytes learning from an instructor, taking classes on millions of virtual miles.

A common approach to solving the problem of self-driving is to analyze human driving, collect tasks and subtasks into building blocks, and create a complete environment for self-driving car development, narrowed down to three main components: algorithms, development tools, and processing hardware.

Algorithms: from raw information to a unified understanding

The first, and possibly the most important, component of self-driving development is the set of algorithms used in various building blocks for solving necessary tasks related to sensor handling, data processing, perception, localization and vehicle control. The ultimate goal at this level is the integration of these blocks into the central software that runs in the car, which poses several engineering challenges. There is a hierarchy among these tasks and subtasks, which can be broken down into three groups: recognition, localization and planning.
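
To make this hierarchy concrete, the sketch below shows one possible way of wiring the three groups into a single processing loop; the class and method names are illustrative assumptions, not a description of any production stack.

# Illustrative skeleton only: the layer names follow the grouping above
# (recognition, localization, planning); the interfaces are hypothetical.
class SelfDrivingStack:
    def __init__(self, recognition, localization, planning):
        self.recognition = recognition    # sensors -> abstract environment model
        self.localization = localization  # sensors + maps -> global pose
        self.planning = planning          # environment model + pose -> controls

    def step(self, sensor_frame):
        env_model = self.recognition.update(sensor_frame)
        pose = self.localization.update(sensor_frame, env_model)
        return self.planning.update(env_model, pose)  # actuator commands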

However, not all of these specifically require AI-based solutions. It is the developers' responsibility and choice to find the right balance between traditional and AI-based algorithms and, if needed, to use a combination of these for the very same problem, such as lane detection or motion planning. The choice of algorithms and their fusion largely depends on the number and types of sensors used on such a platform, which is the main differentiator among developers.
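
As a concrete example of the traditional side of that balance, a basic lane-detection pass can be assembled from classical computer vision primitives; the sketch below uses OpenCV's Canny edge detector and probabilistic Hough transform, with thresholds chosen purely for illustration.

import cv2
import numpy as np

def detect_lane_lines(bgr_image):
    """Classical (non-learned) lane detection: edges plus Hough transform."""
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    edges = cv2.Canny(blurred, 50, 150)

    # Keep only the lower half of the image, where lane markings appear.
    h, w = edges.shape
    mask = np.zeros_like(edges)
    mask[h // 2:, :] = 255
    edges = cv2.bitwise_and(edges, mask)

    # The probabilistic Hough transform returns candidate line segments.
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                            minLineLength=40, maxLineGap=20)
    return [] if lines is None else [line[0] for line in lines]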

There are no identical prototype platforms among the developer communities, as these rely on information coming from platform-specific combinations of various cameras, Light Detection and Ranging (LIDAR) units, radars and ultrasonic sensors, or other external or internal devices. Historically, relying on LIDARs as primary sensors has been the standard way to go, simultaneously solving recognition and localization tasks through point-cloud matching and analysis. However, human drivers rely on their vision 99% of the time while driving; therefore, a camera-first approach is growing more popular by the day. With today's algorithms and processing capabilities, we are able to extract not only the class, but also the distance, size, orientation and speed of objects and landmarks using only cameras, which are taking over the primary role of radars or still-expensive LIDARs.

Once the raw information from the sensors is at hand, algorithms help us make sense of it all. On the recognition layer, low-level sensor fusion is needed to fuse raw information from various sources; multiple detection and classification algorithms then provide a basis for high-level sensor fusion – an association of object instances to each other over multiple time frames. The output of the recognition layer is an abstract environment model containing all relevant information about the surroundings (Figure 2).
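
A minimal sketch of what the association step in high-level fusion can look like is shown below, using a greedy nearest-neighbour match between new detections and existing tracked objects; real systems use far more elaborate assignment and track-management logic.

import math

def associate_detections(tracks, detections, max_distance=2.0):
    """Greedy nearest-neighbour association of new detections to tracked
    objects; a stand-in for the high-level fusion step described above."""
    assignments = {}          # detection index -> track id
    unmatched = []            # detections that start new object instances
    used_tracks = set()
    for d_idx, det in enumerate(detections):
        best_id, best_dist = None, max_distance
        for track_id, track in tracks.items():
            if track_id in used_tracks:
                continue
            dist = math.hypot(det["x"] - track["x"], det["y"] - track["y"])
            if dist < best_dist:
                best_id, best_dist = track_id, dist
        if best_id is None:
            unmatched.append(d_idx)
        else:
            assignments[d_idx] = best_id
            used_tracks.add(best_id)
    return assignments, unmatched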

The next layer is responsible for the global localization of the vehicle, including routing, mapping, odometry and local positioning. Knowing the exact, absolute location of the car is an essential component of motion planning, where HD maps and sparse feature maps provide useful information about the route the car is about to take.
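
The odometry part of this layer can be illustrated with a simple planar dead-reckoning update; in practice this is only one signal among many, fused with GNSS, map matching and visual localization.

import math

def integrate_odometry(pose, speed, yaw_rate, dt):
    """Planar dead-reckoning update: one small piece of the localization
    layer, normally corrected by GNSS and map-based measurements."""
    x, y, heading = pose
    heading = heading + yaw_rate * dt
    x = x + speed * math.cos(heading) * dt
    y = y + speed * math.sin(heading) * dt
    return (x, y, heading)

# Example: 10 m/s forward, gentle left turn, 50 ms time step.
pose = (0.0, 0.0, 0.0)
pose = integrate_odometry(pose, speed=10.0, yaw_rate=0.05, dt=0.05)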

While recognition and localization algorithms have reached a very mature state in most applications, planning and decision making are relatively new fields for developers. Naturally, as the planner modules can only rely on the available information, the quality and reliability of this layer largely depend on the input from the recognition and localization layers. Consequently, only an integrated, full-stack approach to system development is feasible for the future deployment of self-driving cars, one that maintains a deep understanding of what each building block and layer requires, acquires and provides.

In this context, the planning layer is responsible for understanding the abstract scenario, object tracking and behavior prediction, local trajectory planning and actuator control. To give an example: this is the layer which understands that there is a slow vehicle in our lane, explores free space for an overtaking maneuver, uses localization to decide whether there is enough time before exiting the highway, and calculates an optimal trajectory to be followed. While it all sounds quite simple in this example, it still poses one of the largest challenges in self-driving car development: the car has to carry out this maneuver in any driving scenario, with its decisions affecting the behavior of other participants in the traffic and vice versa – playing a multi-agent game where every player needs to win.
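
The toy decision routine below mirrors that overtaking example as a handful of explicit rules; the thresholds and decision labels are invented for illustration, whereas a production planner would score many candidate trajectories instead.

def plan_overtake(ego_speed, lead_speed, gap_to_lead,
                  adjacent_lane_free, distance_to_exit):
    """Toy decision logic for the overtaking example in the text."""
    OVERTAKE_LENGTH = 250.0   # rough metres needed to complete the manoeuvre
    if lead_speed >= ego_speed or gap_to_lead > 100.0:
        return "keep_lane"                # no slow vehicle close ahead
    if not adjacent_lane_free:
        return "follow_and_wait"          # no free space for the manoeuvre yet
    if distance_to_exit < OVERTAKE_LENGTH:
        return "follow_and_wait"          # not enough room before the exit
    return "overtake"                     # plan a lane-change trajectory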

Tools from the drawer: how to train your AI

Today, AI is a general tool for solving various self-driving-related tasks; however, its everyday use has been narrowed down to just a couple of fields. Sophisticated AI-based image recognition techniques using deep convolutional networks (DCNs) have proved themselves against traditional computer vision (CV) algorithms, while neural networks (NNs) also provide superior performance in decision making and planning through recurrent network structures. Combining these structures into a vast black-box network and letting the car learn how to drive in a virtual environment is today referred to as end-to-end learning.
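
For reference, the snippet below sketches a deliberately tiny convolutional classifier in PyTorch, for example for traffic-sign patches; real perception networks are far deeper and trained on vastly more data.

import torch
import torch.nn as nn

class TinyDCN(nn.Module):
    """Minimal convolutional classifier for 64x64 RGB patches."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 32x32 -> 16x16
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(start_dim=1))

logits = TinyDCN()(torch.randn(1, 3, 64, 64))      # one example patch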

Either way, one thing is common to all use cases: vast amounts of training data are needed, containing both positive and negative examples. In order to let such a system enter the roads, evolving from a prototype of limited technological readiness to a verified, fail-safe vehicle, a structured set of development tools should be provided to support the algorithms and the software. These tools should account for data handling, including data collection, annotation (data labeling), augmented data generation, pre- and post-processing, and sensor calibration. There is also a need, however, for constant algorithm support: flexible training environments for the various AI algorithms, specialized frameworks for the optimization of neural network architectures, and the previously mentioned high-level sensor fusion.
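
Augmented data generation, one of the tool-chain items listed above, can be as simple as mirroring a frame and jittering its brightness while keeping the labels consistent; the sketch below assumes axis-aligned bounding-box labels.

import numpy as np

def augment_frame(image, boxes, rng):
    """Simple augmented-data generation: horizontal flip plus brightness
    jitter. `boxes` is an (N, 4) array of (x_min, y_min, x_max, y_max)."""
    h, w = image.shape[:2]
    out_boxes = boxes.copy()
    if rng.random() < 0.5:                           # mirror the whole scene
        image = image[:, ::-1].copy()
        out_boxes[:, [0, 2]] = w - boxes[:, [2, 0]]  # flip the boxes with it
    gain = rng.uniform(0.7, 1.3)                     # crude lighting change
    image = np.clip(image.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    return image, out_boxes

# Usage: rng = np.random.default_rng(0); augment_frame(frame, labels, rng)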

Once algorithms are trained, individual and complex component testing is required to meet automotive standards for safety and reliability. This requires objective measures such as precision, recall or false rejection rates, setting a demand for benchmarking and verification tools for the very specific problem of self-driving. The complexity of the components and building blocks, and the vast variety of possible driving scenarios, do not allow developers to perform thorough field testing. This gives rise to the market of complex, photorealistic simulation environments and open-source computer games. These platforms not only allow us to test functionalities and reproduce scenes, but also provide a training environment for motion planning modules for maneuvering and accident simulation (Figure 3).
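
The objective measures mentioned above follow directly from per-frame counts of correct detections, false alarms and misses, as in the short helper below.

def detection_metrics(true_positives, false_positives, false_negatives):
    """Precision, recall and false rejection rate from raw counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    false_rejection_rate = false_negatives / (true_positives + false_negatives)
    return precision, recall, false_rejection_rate

# Example: 92 correct detections, 5 false alarms, 8 missed objects.
print(detection_metrics(92, 5, 8))   # approx. (0.948, 0.92, 0.08)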

Hardware: running in real-time

The downside of using AI-based deep learning (DL) algorithms is the relatively high computational capacity they require. We are in a phase of hardware development where the utilized neural network architectures still need to be downsized and optimized for real-time inference (data processing), and, as a trade-off, the precision and reliability of the algorithms suffer.
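
One common downsizing step is post-training quantization, which stores selected weights in int8 and trades some precision for smaller, faster inference; the sketch below uses PyTorch's dynamic quantization on a stand-in model rather than a real driving network.

import torch
import torch.nn as nn

# Stand-in model only; real pipelines also prune, distil and compile
# networks for the target automotive chip.
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
with torch.no_grad():
    out = quantized(torch.randn(1, 512))   # runs with int8 linear layers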

Furthermore, the only widely commercially available technology for running such algorithms is provided by graphics processing units (GPUs), but these are general-purpose processors, not inherently optimized for self-driving-specific networks. As limited processing capabilities are now posing a bottleneck in the productization of these vehicles, a new era has started in answer to this demand, in which chip providers are rethinking chip design to focus on the hardware acceleration of NN inference. This will ultimately lead to increased performance density and allow automotive safety integrity level (ASIL) compliance (Figure 4).

The transparency of the self-driving ecosystem is crucial: the three components (algorithms, tools and hardware) cannot be separated. This requires the simultaneous development of these components, and the industry is striving for this structured approach. Otherwise, what remains is just building blocks without application, and an unsatisfied technological demand.

Author Bio:

Árpád Takács is an outreach scientist for AImotive. AImotive is the leader in AI-powered motion, and the first company that will bring Level 5 self-driving technology with a camera-first sensor approach to the global market. Árpád's fields of expertise include analytical mechanics, control engineering, surgical robotics and machine learning. Since 2013, Árpád has served as a research assistant at the Antal Bejczy Center for Intelligent Robotics at Óbuda University. In 2016, he joined the R&D team of the Austrian Center for Medical Innovation and Technology. Árpád received his mechatronics and mechanical engineering modeling degree from the Budapest University of Technology and Economics.

www.aimotive.com