Abstract:
The accuracy of a classifier, whether it is an ensemble or not, is directly influenced by the training data used in learning. In remote sensing, training data mislabeling is inevitable and poses a major challenge. This article proposes a versatile data cleaning method, which handles the mislabeling problem by exploiting ensemble concepts to identify and then eliminate or correct the mislabeled training data. A powerful ensemble method, random forest (RF), is at the core of our filter design and helps to distinguish mislabeled data from uncorrupted data more accurately. The major contribution of this work lies in the explicit use of the hypothesis margin as a decision criterion to identify and eliminate or correct mislabeled training data in an ensemble learning framework. Another key development that makes our algorithm superior to existing approaches is a design that prevents rare-class instances from being mistaken for class noise. This fundamental aspect makes our data cleaning system particularly suitable for remote sensing classification tasks, which usually suffer from both mislabeling and class imbalance problems. The effectiveness of our algorithm is demonstrated on land cover mapping. The generalization performance of two major supervised noise-sensitive classifiers, boosting and K-nearest neighbors (KNN), is strengthened by effective class noise reduction. A comparative analysis is conducted with respect to RF, deep convolutional neural networks (CNNs), and two well-established ensemble-based class noise filters, the majority vote and the consensus vote filters. This analysis demonstrates that our approach is more accurate than deep CNNs (1-D CNN, AlexNet, EfficientNet, ResNet50, and ShuffleNet) and the reference ensemble methods.
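To make the margin-based filtering idea concrete, the following is a minimal sketch, not the authors' algorithm. It assumes one common supervised definition of the ensemble margin: the votes for the given training label minus the largest vote count for any other class, divided by the number of voters. The function names, the toy vote lists, and the zero threshold are all hypothetical, and the rare-class safeguard described in the abstract is not reproduced here.

```python
from collections import Counter


def ensemble_margin(votes, label):
    """Margin in [-1, 1] of one training instance (assumed definition).

    votes: class predicted by each base classifier, e.g. each tree of an RF.
    label: the training label of the instance, which may be wrong.
    """
    counts = Counter(votes)
    v_label = counts.get(label, 0)
    # Largest vote count among all classes other than the given label.
    v_other = max((c for k, c in counts.items() if k != label), default=0)
    return (v_label - v_other) / len(votes)


def flag_suspects(all_votes, labels, threshold=0.0):
    """Indices of instances whose margin is below the (assumed) threshold."""
    return [
        i
        for i, (votes, label) in enumerate(zip(all_votes, labels))
        if ensemble_margin(votes, label) < threshold
    ]


def majority_label(votes):
    """Most voted class, usable as a corrected label for a flagged instance."""
    return Counter(votes).most_common(1)[0][0]


# Toy example: five voters, three training instances (made-up data).
all_votes = [
    ["water", "water", "water", "water", "water"],
    ["forest", "forest", "forest", "urban", "forest"],
    ["urban", "urban", "forest", "urban", "forest"],
]
labels = ["water", "urban", "urban"]

suspects = flag_suspects(all_votes, labels)            # [1]
corrected = [majority_label(all_votes[i]) for i in suspects]  # ["forest"]
```

A flagged instance can either be dropped from the training set or relabeled with the majority class, mirroring the "eliminate or correct" choice in the abstract. In a real setting the votes would more plausibly come from trees that did not see the instance during training (for example, out-of-bag trees); that choice is an assumption of this sketch rather than something stated in the abstract.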
Published in: IEEE Transactions on Geoscience and Remote Sensing (Volume: 61)