I. Introduction
Vehicle re-identification (Re-ID) aims to identify the same vehicle captured by different cameras. It supports cross-camera vehicle tracking and vehicle trajectory mining, and has great potential value in intelligent transportation and public security management, given the massive deployment of surveillance cameras on city roads. Previous research [1], [2], [3], [4], [5] focuses on supervised vehicle Re-ID, in which models are trained and tested on annotated datasets from the same domain, and has achieved good performance. Unfortunately, due to the domain gap between the training datasets and actual application scenarios [6], [7], [8], [9], models trained on public datasets cannot be directly applied in practice, as their performance degrades greatly. Obtaining a practical model would require collecting a large amount of data from the application scenario and annotating it with vehicle identities, which is impractical. On the one hand, since most surveillance videos are not clear enough to recognize license plate numbers, labeling vehicle identities based on appearance alone is time-consuming. On the other hand, such an annotation-based approach cannot effectively exploit the massive data generated by surveillance cameras every day. Therefore, unsupervised Re-ID methods, which can learn discriminative representations without annotation, are more practical and have recently attracted increasing attention.