I. Introduction
Recent breakthroughs in deep learning and sensor technologies have motivated the rapid development of autonomous driving technology, which could potentially improve road safety, traffic efficiency, and personal mobility [1]–[3]. However, technical challenges and the cost of exteroceptive sensors have so far limited autonomous driving systems to small-scale deployments in confined and controlled environments. One critical challenge is to obtain a sufficiently accurate understanding of the vehicle’s 3D surrounding environment in real time. To this end, sensor fusion, which leverages multiple types of sensors with complementary characteristics to enhance perception and reduce cost, has emerged as a research theme.