I. Introduction
Among existing anomaly detection methods, data-driven approaches have become mainstream. Of these, supervised learning methods, in which every sample in the training set is labeled, can significantly improve detection accuracy, and supervised classification algorithms have therefore been widely used for anomaly detection. However, in many real-world detection problems, anomalous data are far scarcer than normal data. This leads to data imbalance, i.e., the dataset contains majority and minority classes with significantly different sample sizes [1]. In classification based on intelligent algorithms, imbalanced datasets are a frequent challenge in model training. If imbalanced data are modeled directly, the features chosen by traditional feature selection methods tend to favor the majority classes over the minority classes, and the trained model overfits the majority classes, which biases its predictions.
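The effect of imbalance on a naive model can be illustrated with a minimal sketch. The class counts (990 normal, 10 anomalous) and all function names below are hypothetical and chosen only for illustration; they are not taken from any dataset or method discussed here. The sketch shows that a classifier which always predicts the majority class attains high accuracy while detecting no anomalies at all.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the most frequent label, i.e. what a fully majority-biased model predicts."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of true `positive` samples that were predicted as `positive`."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical imbalanced dataset: label 0 = normal, label 1 = anomalous.
y_true = [0] * 990 + [1] * 10

counts = Counter(y_true)
imbalance_ratio = counts[0] / counts[1]  # majority size divided by minority size

# A degenerate classifier that always outputs the majority class.
predicted_label = majority_baseline(y_true)
y_pred = [predicted_label] * len(y_true)

acc = accuracy(y_true, y_pred)
anomaly_recall = recall(y_true, y_pred, positive=1)

print(f"imbalance ratio: {imbalance_ratio:.0f}:1")
print(f"accuracy: {acc:.2f}")
print(f"anomaly recall: {anomaly_recall:.2f}")
```

In this toy setting, accuracy alone hides the failure on the minority class, which is why per-class measures such as recall are usually reported alongside it for imbalanced problems.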