I. Introduction
With the rapid development of deep learning, intelligent fault diagnosis (IFD) has achieved considerable success in recent years [1]–[3]. These successes rest on a common assumption: sufficient labeled data are available to train reliable diagnosis models [4], [5]. In engineering scenarios, however, it is difficult to collect sufficient labeled data because labeling requires substantial human effort. Consequently, the data collected from real machines are largely unlabeled and may not suffice to train diagnosis models that provide accurate results.