1. Introduction
3D object detection from point cloud data is a key component of autonomous vehicle (AV) systems. Unlike ordinary 2D object detection, which only estimates 2D bounding boxes in the image plane, an AV needs to estimate more informative 3D bounding boxes in the real world to support high-level tasks such as path planning and collision avoidance. This has motivated recent 3D object detection approaches that apply convolutional neural networks (CNNs) to process the more representative point cloud data from a high-end LiDAR sensor.
Predicted bounding boxes from a sparse 3D point cloud by (a) the representative single-stage detector SECOND [25] and (b) our single-stage method guided by auxiliary tasks and point-level supervision. Object points, the ground-truth box, center points predicted by the auxiliary network, and the final detection results are shown in green, white, yellow, and red, respectively.