I. Introduction
In machine learning, incomplete training data is a major problem. Training data can be incomplete for many reasons, including mislabeling, bias, omissions, insufficiency, imbalance, noise, and outliers. This paper mainly tackles the outlier problem. An outlier is a pattern that was either mislabeled in the training data or is inherently ambiguous and hard to recognize. Two circumstances can arise when collecting training data: information that truly represents the pattern may be absent, or additional information that is not relevant to the patterns to be recognized may be present. The former is a matter of signal collection, feature extraction, or feature selection, while the latter concerns noise and outliers.