1. Introduction
Semi-supervised learning (SSL) [7] has shown promise in leveraging unlabeled data to reduce the cost of constructing labeled data [4], [5], [36], [40], [57] and even to boost performance at scale [29], [49], [67], [68]. The common approach of these algorithms is to produce pseudo-labels for unlabeled data based on the model's predictions and to utilize them for regularizing model training [29], [38], [57]. Although adopted in a variety of tasks, these algorithms often assume class-balanced data, whereas many real-world datasets exhibit long-tailed distributions [3], [18], [31], [32]. With class-imbalanced data, the class distribution of pseudo-labels from unlabeled data becomes severely biased toward the majority classes due to confirmation bias [2]. Such biased pseudo-labels can further bias the model during training.
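To make the pseudo-labeling mechanism concrete, the following is a minimal sketch of confidence-thresholded pseudo-labeling in the style of methods such as FixMatch; the function name, the threshold `tau`, and the loss weight `lam` are illustrative assumptions rather than the exact recipe of any single cited method.

```python
import torch
import torch.nn.functional as F

def pseudo_label_loss(model, x_labeled, y_labeled, x_unlabeled, tau=0.95, lam=1.0):
    # Supervised loss on the labeled batch.
    sup_loss = F.cross_entropy(model(x_labeled), y_labeled)

    # Predict on unlabeled data without tracking gradients.
    with torch.no_grad():
        probs = torch.softmax(model(x_unlabeled), dim=1)
        conf, pseudo = probs.max(dim=1)     # confidence and hard pseudo-label
        mask = (conf >= tau).float()        # keep only confident predictions

    # Unsupervised loss: fit the model to its own confident pseudo-labels.
    # Under class imbalance, `pseudo` tends to over-represent majority classes,
    # and training on it reinforces that bias (confirmation bias).
    per_sample = F.cross_entropy(model(x_unlabeled), pseudo, reduction="none")
    unsup_loss = (per_sample * mask).mean()
    return sup_loss + lam * unsup_loss
```

Because the unsupervised term reuses the model's own predictions as targets, any class bias in those predictions is fed back into training, which is the failure mode motivating this work.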
Figure 1. Glimpse of the DASO framework. DASO reduces the overall bias in pseudo-labels (PLs) from unlabeled data by blending two complementary PLs from different classifiers. Note that bias is conceptually illustrated as relative PL size (rel. PL size), i.e., the pseudo-label size normalized by the actual label size.
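For illustration only, the "relative PL size" plotted in the figure can be computed as below; this assumes the ground-truth labels of the unlabeled set are available for analysis (as in benchmark studies), and the function name is hypothetical.

```python
import numpy as np

def relative_pl_size(pseudo_labels, true_labels, num_classes):
    # Per-class pseudo-label counts normalized by per-class true label counts.
    pl_counts = np.bincount(pseudo_labels, minlength=num_classes)
    gt_counts = np.bincount(true_labels, minlength=num_classes)
    return pl_counts / np.maximum(gt_counts, 1)
```

A value above 1 for a majority class means it absorbs pseudo-labels that actually belong to minority classes, whose relative PL size correspondingly falls below 1; an unbiased pseudo-labeler would yield values near 1 for every class.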