I. Introduction
Person re-identification (re-ID) aims to associate images of the same person captured by different cameras. Driven by the urgent demand for public safety and the increasing number of surveillance cameras on university campuses, in theme parks, along roads, etc., this technology benefits intelligent transportation [4], [5] and video surveillance system design [6], [7]. It is a challenging task because of the complex intra-class and inter-class variations caused by illumination changes, camera viewpoint shifts, human pose variations, occlusion, etc. With the power of deep neural networks, fully supervised person re-ID [8], [9], [10], [11] has made significant progress on public benchmarks [2], [12], [13]. However, in practical application scenarios, supervised person re-ID suffers from a lack of sufficient labelled training data due to the high cost of manual annotation. Recently, researchers have paid more attention to unsupervised person re-ID and unsupervised domain adaptive (UDA) person re-ID [3], [14], [15], [16], [17], [18], [19], [20], [21]. The former task uses only the unlabelled data collected in the target scene to train the re-ID model. The latter task jointly uses the unlabelled data in the target domain and the labelled data from the source domain to improve person re-ID performance.