1. Introduction
Deep learning has achieved remarkable success on various vision tasks, including image recognition [16], [12], [14] and semantic segmentation [55], [49]. However, the recent success of deep learning methods heavily relies on massive labeled data. In practice, collecting abundant annotated data is expensive [30], [6], [54], [33]. Meanwhile, each domain has its own specific exploratory factors, namely semantics, e.g., the illuminations, colors, visual angles or backgrounds, resulting in the domain shift [28]. Hence, traditional deep models trained on a large dataset usually show poor generalizations on a new domain due to the domain shift issues [24], [10]. To remedy this, one appealing alternative is domain adaptation (DA), which strives to leverage the knowledge of a label-rich source domain to assist the learning in a related but unlabeled target domain [24], [10].