I. Introduction
The demand for new technologies in the operation and maintenance (O&M) of wind turbines (WTs) has grown in recent years. Monitoring the condition of WTs by analyzing operational data is crucial for detecting incipient faults and reducing O&M costs [1]. Methods based on supervisory control and data acquisition (SCADA) data have received considerable attention for WT condition monitoring [2], [3], [4]. However, SCADA data suffer from an imbalance problem [5]: far less fault data is available than normal data. The reasons include the fact that WTs are not allowed to operate for long under fault conditions [6] and that manual labeling is inefficient. Therefore, fault detection (FD) algorithms based on normal behavior modeling [7], [8] have been widely used. These FD models are trained using only normal data and detect faults by analyzing the differences between current and normal conditions.
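As a rough illustration of the normal behavior modeling idea described above, the following sketch fits a simple regression model on normal data only and flags samples whose residual exceeds a threshold. It is a minimal sketch under stated assumptions, not the method of any cited work: the linear model, the three-sigma threshold, the synthetic signals (wind speed and ambient temperature as inputs, a bearing temperature as target), and the function names `fit_normal_model` and `detect_faults` are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_normal_model(X, y, k=3.0):
    """Fit a least-squares model on normal data only (illustrative assumption).

    Returns the coefficients and a residual threshold of k standard deviations.
    """
    A = np.column_stack([X, np.ones(len(X))])  # append an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    threshold = k * residuals.std()
    return coef, threshold


def detect_faults(X, y, coef, threshold):
    """Flag samples whose deviation from the normal model exceeds the threshold."""
    A = np.column_stack([X, np.ones(len(X))])
    return np.abs(y - A @ coef) > threshold


# Synthetic "normal" data (hypothetical): wind speed [m/s] and ambient
# temperature [deg C] as inputs, bearing temperature [deg C] as target.
rng = np.random.default_rng(0)
X_normal = rng.uniform([3.0, 0.0], [20.0, 30.0], size=(500, 2))
y_normal = (40.0 + 1.5 * X_normal[:, 0] + 0.8 * X_normal[:, 1]
            + rng.normal(0.0, 0.5, size=500))

coef, threshold = fit_normal_model(X_normal, y_normal)

# Two new samples at the same operating point: the first follows normal
# behavior, the second carries a 10 deg C offset standing in for a fault.
X_new = np.array([[10.0, 15.0], [10.0, 15.0]])
y_new = np.array([67.0, 77.0])
flags = detect_faults(X_new, y_new, coef, threshold)
print(flags)
```

Because the model never sees fault data during training, the class imbalance in SCADA data does not affect the fit; only the choice of threshold determines how large a deviation is treated as a fault.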