I. Introduction
The ITS is a typical complex system that integrates various visual perception tasks, including object detection [1], [2], object tracking [3], [4], and object segmentation [5]. Deep learning-based visual perception models are extensively applied in ITS, playing a pivotal role in ensuring its safe and stable development.