I. Introduction
The fault diagnosis of bearings is crucial for ensuring the reliability of rotating machinery. With the development of Internet of Things technology and information science, data-driven fault diagnosis and condition monitoring technologies have received extensive attention [1]. As a bottom-up modeling approach that learns fault-related information directly from monitoring data, the data-driven bearing fault diagnosis model offers considerable advantages in terms of adaptability, robustness, and adjustability. The two main diagnostic tasks are the discrimination of different fault locations (on the inner ring, outer ring, roller, cage, etc.) and the identification of fault severities related to the physical sizes of defects [2]. Most data-driven methods treat bearing fault diagnosis as a standard classification task and apply machine learning methods such as the k-nearest neighbor (KNN) algorithm [3], support vector machine (SVM) [4], hidden Markov model [5], echo state network [6], autoencoder [7], generative adversarial network [8], recurrent neural network [9], and convolutional neural network (CNN) [10]. In general, these methods depend significantly on labeled data and assume that the training and test data have the same distribution. This assumption is difficult to satisfy in practice owing to the high cost and difficulty of collecting sufficient labeled data for all the fault severities of bearings, which limits the applicability of data-driven models.