I. Introduction
Three-dimensional (3-D) object detection is a critical task in autonomous driving. It underpins advances in intelligent transportation systems and has gained widespread attention [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18]. With the availability of various sensor modalities, such as cameras and LiDAR, significant progress has been made in single-modal 3-D object detection using either camera images [19], [20], [21], [22], [23] or LiDAR point clouds [1], [2]. Compared with the image data provided by cameras, LiDAR point clouds offer accurate depth and position information, and they have been the subject of extensive research in recent years [24], [25], [26], [27], [28], [29], [30], [31].