I. Introduction
Map-based localization and sensing are key components of autonomous driving technologies [40], [46], [48], for which high-quality 3D map reconstruction is of fundamental importance [66]. However, because real-world environments are highly dynamic and uncontrollable, building a high-quality 3D map is never easy. The literature contains a significant amount of research on 3D map reconstruction; representative examples are the traditional image-based Structure-from-Motion technique [59], the depth-image-based Truncated Signed Distance Function approach [56], and the lidar-based Localization and Mapping method [56], [88]. Generally, such approaches achieve good results in nearly static environments, but the 3D reconstruction quality degrades significantly in highly dynamic and crowded environments; see Fig. 1 for an example.
Fig. 1. 3D reconstruction using the Lidar Odometry and Mapping (LOAM) [87] technique: the top image shows a decent-quality 3D map of a static campus environment, while the mapping result for the plaza (bottom image) is quite unsatisfactory due to the “ghost” artefacts (see the zoomed-in areas in the red boxes) caused by moving objects.