
Learning with Noisy Labels via Sparse Regularization


Abstract:

Learning with noisy labels is an important and challenging task for training accurate deep neural networks. Some commonly-used loss functions, such as Cross Entropy (CE), suffer from severe overfitting to noisy labels. Robust loss functions that satisfy the symmetric condition were tailored to remedy this problem, but they suffer from an underfitting effect. In this paper, we theoretically prove that any loss can be made robust to noisy labels by restricting the network output to the set of permutations over a fixed vector. When the fixed vector is one-hot, we only need to constrain the output to be one-hot, which however produces zero gradients almost everywhere and thus makes gradient-based optimization difficult. In this work, we introduce the sparse regularization strategy to approximate the one-hot constraint, which is composed of a network output sharpening operation that enforces the output distribution of a network to be sharp and an ℓp-norm (p ≤ 1) regularization that promotes the network output to be sparse. This simple approach guarantees the robustness of arbitrary loss functions while not hindering the fitting ability. Experimental results demonstrate that our method can significantly improve the performance of commonly-used loss functions in the presence of noisy labels and class imbalance, and outperform the state-of-the-art methods. The code is available at https://github.com/hitcszx/lnl_sr.
Date of Conference: 10-17 October 2021
Date Added to IEEE Xplore: 28 February 2022
Conference Location: Montreal, QC, Canada

1. Introduction

Deep neural networks (DNNs) have achieved remarkable success on various computer vision tasks, such as image classification, segmentation, and object detection [7]. The most widely used paradigm for DNN training is end-to-end supervised learning, whose performance largely relies on massive high-quality annotated data. However, collecting large-scale datasets with fully precise annotations (also called clean labels) is usually expensive and time-consuming, and sometimes even impossible. Noisy labels, which are systematically corrupted from ground-truth labels, are ubiquitous in many real-world applications, such as online queries [19], crowdsourcing [1], adversarial attacks [30], and medical image analysis [12]. On the other hand, it is well known that over-parameterized neural networks have enough capacity to memorize large-scale data with even completely random labels, leading to poor generalization performance [28], [1], [11]. Therefore, robust learning with noisy labels has become an important and challenging task in computer vision [25], [8], [12], [31].
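
The abstract describes the sparse regularization strategy as a network output sharpening operation combined with an ℓp-norm (p ≤ 1) penalty that pushes the prediction toward a one-hot vector. The snippet below is a minimal PyTorch sketch of that idea; the temperature-based sharpening, the hyperparameter names, and their default values are illustrative assumptions rather than the authors' implementation, which is available at https://github.com/hitcszx/lnl_sr.

```python
import torch
import torch.nn.functional as F


def sparse_regularized_loss(logits, targets, p=0.5, lam=1.0, tau=0.5):
    """Base loss plus a sparse regularization term.

    Sketch only: the sharpening operation and the hyperparameters
    (p, lam, tau) are assumptions for illustration, not the authors'
    reference implementation (see https://github.com/hitcszx/lnl_sr).
    """
    # Output sharpening: a temperature tau < 1 makes the softmax sharper.
    probs = F.softmax(logits / tau, dim=1)
    # lp regularizer with p <= 1: sum_i probs_i^p per sample, which is
    # smallest when the output distribution is (near) one-hot, i.e. sparse.
    sr_term = probs.pow(p).sum(dim=1).mean()
    # Any commonly-used loss can serve as the base; cross entropy is used here.
    base = F.cross_entropy(logits, targets)
    return base + lam * sr_term


# Toy usage with random tensors.
logits = torch.randn(8, 10, requires_grad=True)  # batch of 8, 10 classes
targets = torch.randint(0, 10, (8,))             # possibly noisy labels
loss = sparse_regularized_loss(logits, targets)
loss.backward()
```

In this sketch the regularizer is simply added to the base loss, so the strength lam trades off robustness to label noise against fitting ability, in line with the trade-off discussed in the abstract.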
