I. Introduction
Methods for addressing the small-sample problem in power grid data fall roughly into four categories: semi-supervised learning, transductive learning, active learning, and virtual sample generation[1]. Among them, semi-supervised learning has the advantage that its samples are relatively easy to acquire. Its main idea is to use unlabeled samples to assist learning from the labeled training samples, exploiting the distribution information of the unlabeled samples to improve the generalization ability of the model[2]. Active learning queries and labels the most important samples to assist classification learning on the original samples[3]; the difficulty lies in how to determine the importance of a sample and label it[4]. The virtual sample method generates new samples by adopting certain assumptions about the characteristics of the samples. It is an intuitive way to expand the original sample set and thereby reduce the impact of the small-sample problem[5].
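As an illustration of the virtual sample idea only, the following minimal sketch expands a small labeled set by adding small Gaussian perturbations to each original sample. The assumption adopted here, that a slightly perturbed sample keeps the label of its source sample, is ours and is not taken from the cited works; the function name `make_virtual_samples` and the parameters `copies`, `noise_scale`, and `seed` are likewise hypothetical.

```python
import random
import statistics


def make_virtual_samples(samples, labels, copies=3, noise_scale=0.05, seed=0):
    """Expand a small labeled set with noise-perturbed virtual samples.

    Assumption (illustrative, not from the cited works): a sample moved by
    a small Gaussian perturbation keeps the label of the original sample.
    """
    rng = random.Random(seed)
    n_features = len(samples[0])

    # Scale the noise per feature by that feature's spread in the data.
    stds = [
        statistics.pstdev([row[j] for row in samples])
        for j in range(n_features)
    ]

    # Start from copies of the original samples, then append virtual ones.
    expanded_x = [list(row) for row in samples]
    expanded_y = list(labels)
    for row, label in zip(samples, labels):
        for _ in range(copies):
            virtual = [
                value + rng.gauss(0.0, noise_scale * std)
                for value, std in zip(row, stds)
            ]
            expanded_x.append(virtual)
            expanded_y.append(label)
    return expanded_x, expanded_y


if __name__ == "__main__":
    # Toy two-feature data; the values are made up for the example.
    x = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
    y = [0, 1, 0]
    vx, vy = make_virtual_samples(x, y, copies=2)
    print(len(vx), len(vy))
```

The choice of `noise_scale` encodes how strong the assumption is: larger values spread the virtual samples further from the originals and make it more likely that a virtual sample no longer belongs to the class of its source.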