1. Introduction
Learning generalized representations from unlabeled data is challenging across many fields, yet Self-Supervised Learning (SSL) has recently demonstrated remarkable success in learning semantically invariant representations without labels [40], [41], [53]. SSL methods fall into two main types according to the pretext task: generative and discriminative. Generative SSL reconstructs altered or distorted data back to its original form [9], [28], [31], [59], [65], [71], whereas early discriminative SSL predicted hand-designed labels, yielding task-specific representations with limited generalizability [25], [57], [75]. More recent discriminative SSL trains the model to identify similarities and differences between pairs of augmented examples [7], [10], [11], [26], [29], [74], as sketched below. The success of SSL in deep image models has spurred progress in other data modalities [52], [53], [54], [61], [62] and in attention-based models such as transformers [8], [12], [49], [72].

Recent discriminative SSL aims to learn content- and semantic-invariant representations that are robust to data augmentations. However, the learned representations can become unstable when a single subtle factor of the data is shifted to a value not covered by the training augmentations. Since incorporating every possible subtle change during training is prohibitively expensive, insights are needed to uncover the root cause of this instability and to find a solution that prevents performance deterioration at inference time. Figure 1 summarizes this deterioration effect.
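To make the pair-based discriminative objective concrete, the following is a minimal sketch of a SimCLR-style NT-Xent contrastive loss in PyTorch; the function name, temperature value, and embedding sizes are illustrative assumptions and are not taken from the methods cited above.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss over a batch of augmented pairs.

    z1[i] and z2[i] are embeddings of two augmentations of the same input;
    every other embedding in the batch serves as a negative.
    """
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2n, d), unit-norm
    sim = z @ z.t() / temperature                       # scaled cosine similarities
    sim.fill_diagonal_(float('-inf'))                   # exclude self-similarity
    # The positive for row i (i < n) sits at index i + n, and vice versa.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Usage example with random tensors standing in for encoder outputs of two views.
if __name__ == "__main__":
    z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
    print(nt_xent_loss(z1, z2).item())
```

Because the objective only pulls together views produced by the chosen augmentations, factors of variation that the augmentations never touch are left unconstrained, which is the instability discussed above.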