I. Introduction
In recent years, the combination of 3D LiDAR (Light Detection and Ranging) sensors and cameras has become common in autonomous driving and mobile robotics. LiDAR sensors, with their accurate 3D ranging capability, are widely used in area mapping [1], object tracking [2], obstacle avoidance [3], [4], [5], and other range-critical tasks, although they typically offer lower angular resolution than cameras. Cameras, in contrast, provide high angular resolution but cannot easily capture range information: monocular cameras discard range during the imaging process, and when multiple cameras are used to recover range through methods such as triangulation, the resulting range quality is typically inferior to that of LiDAR, particularly in large outdoor scenes. Consequently, the fusion of 3D LiDAR data and 2D camera data has become an attractive solution in a variety of applications, leveraging the complementary strengths of the two sensor types [6], [7].