I. Introduction
Significant progress has been made in autonomous driving since the first successful demonstration in the 1980s [1] and the DARPA Urban Challenge in 2007 [2]. Autonomous driving has high potential to decrease traffic congestion, improve road safety, and reduce carbon emissions [3]. However, developing reliable autonomous driving remains very challenging, because driverless cars are intelligent agents that must perceive, predict, decide, plan, and execute their decisions in the real world, often in uncontrolled or complex environments such as the urban areas shown in Fig. 1. A small error in the system can cause fatal accidents.
Fig. 1. A complex urban scenario for autonomous driving. The driverless car uses multi-modal signals for perception, such as RGB camera images, LiDAR points, Radar points, and map information. It needs to perceive all relevant traffic participants and objects accurately, robustly, and in real time. For clarity, only the bounding boxes and classification scores for some objects are drawn in the image. The RGB image is adapted from [4].