I. Introduction
In its most basic form, vibration monitoring uses statistical features derived from the raw vibration waveform to train conventional machine learning models (e.g., k-nearest neighbour, decision tree, random forest, and support vector machine) for fault recognition [1]. The performance of the downstream model relies heavily on the quality of the extracted features. Gearbox vibration signals are intricate, nonstationary, and nonlinear. When acquired from sensors mounted on real-world wind turbines, the signals are further contaminated by noise. Consequently, time-domain features extracted directly from the raw vibration waveform are not sufficiently discriminative to distinguish a faulty gearbox component from a healthy one. To address this problem, advanced signal processing algorithms, including the discrete wavelet transform, continuous wavelet transform, variational mode decomposition, and empirical mode decomposition, among others, are used to extract more meaningful features from the raw vibration waveform. These features are then fed to the downstream model for fault recognition. In recent years, the combination of empirical mode decomposition (EMD) with conventional machine learning models has garnered significant interest.
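To make the notion of time-domain statistical features concrete, the following minimal sketch computes a few descriptors that are commonly used in vibration monitoring (RMS, peak, crest factor, skewness, and kurtosis) from a one-dimensional waveform. The function name `time_domain_features` and the particular feature set are illustrative assumptions, not the feature list used in this work, and the synthetic sine wave merely stands in for a measured signal.

```python
import math


def time_domain_features(signal):
    """Return simple statistical features of a 1-D waveform.

    Hypothetical helper for illustration only; the feature set is an
    assumption and not taken from the paper.
    """
    n = len(signal)
    if n < 2:
        raise ValueError("need at least two samples")

    mean = sum(signal) / n
    centred = [x - mean for x in signal]
    variance = sum(c * c for c in centred) / n  # population variance
    std = math.sqrt(variance)
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)

    # Guard against division by zero for a constant (or all-zero) signal.
    skewness = (sum(c ** 3 for c in centred) / n) / std ** 3 if std > 0 else 0.0
    kurtosis = (sum(c ** 4 for c in centred) / n) / variance ** 2 if std > 0 else 0.0
    crest_factor = peak / rms if rms > 0 else 0.0

    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "peak": peak,
        "crest_factor": crest_factor,
        "skewness": skewness,
        "kurtosis": kurtosis,  # non-excess definition (Gaussian noise gives about 3)
    }


# Synthetic stand-in for a measured waveform: one period of a unit sine.
waveform = [math.sin(2 * math.pi * k / 1000) for k in range(1000)]
features = time_domain_features(waveform)
print(features)
```

For a pure sine wave, the RMS is close to 0.707 and the kurtosis close to 1.5, whereas impulsive signals of the kind produced by localized gear or bearing damage tend to raise both the kurtosis and the crest factor. A feature vector of this type, computed per signal segment, is what a conventional classifier would receive as input.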