I. Introduction
A rolling bearing is a precision mechanical element mounted between a shaft and its housing that converts sliding friction into rolling friction to reduce friction loss. Rotating machinery is complex and usually operates under high-temperature, high-pressure, and high-speed conditions. High-intensity motion and friction loss make rolling bearings vulnerable to damage [1], [2], [3]. Therefore, it is necessary to continuously monitor and diagnose the health status of rolling bearings. In the field of intelligent fault diagnosis, traditional diagnostic methods, such as the artificial neural network (ANN) [4], support vector machine (SVM) [5], and extreme learning machine (ELM) [6], [7], rely on signal processing techniques to extract effective features from the original diagnostic signals. In practice, however, the fault vibration signal is usually nonstationary and contaminated by strong noise, which makes it difficult to design an effective feature extraction method manually. In recent years, bearing fault diagnosis methods based on deep learning (DL) have achieved remarkable results owing to their automatic feature extraction ability and end-to-end approach [8], [9], [10]. Zhang et al. [11] designed an activation function with an adaptive slope and threshold based on the tanh function (STAC-tanh) and combined it with the deep residual network to achieve adaptive extraction of effective fault features. Shen et al. [12] proposed a deep multilabel learning framework called the multilabel convolutional neural network (MLCNN), which can use samples with missing labels for network training by learning relevant features. Based on a simple spectrum matrix obtained by the short-time Fourier transform (STFT), He and He [13] established an optimized DL structure, the large memory storage and retrieval (LAMSTAR) neural network, to realize bearing fault diagnosis.