I. Introduction
Pedestrian detection is a specialized form of object detection that has gained increasing interest in computer vision and multimedia analysis [1], [2], [3], [4], [5], [6], [7], [8], [9]. It is crucial for applications like autonomous driving systems [1], [2], [3], [4], human-robot interaction [5], [6], and intelligent video surveillance [7], [8], [9]. However, pedestrian detection in crowded scenes is more challenging than general object detection due to diverse appearances, scale variations, and occlusions. Especially, the detection performance for heavily occluded samples still remains unsatisfactory, even many Convolutional Neural Networks (CNNs) have been applied to handle the issue.