Self-Supervised Learning Across Domains

Abstract:

Human adaptability relies crucially on learning and merging knowledge from both supervised and unsupervised tasks: parents point out a few important concepts, and the children then fill in the gaps on their own. This is particularly effective because supervised learning can never be exhaustive, so learning autonomously makes it possible to discover invariances and regularities that help generalization. In this paper we propose to apply a similar approach to the problem of object recognition across domains: our model learns semantic labels in a supervised fashion and broadens its understanding of the data by learning from self-supervised signals on the same images. This secondary task helps the network focus on object shapes, learning concepts like spatial orientation and part correlation, while acting as a regularizer for the classification task over multiple visual domains. Extensive experiments confirm our intuition and show that our multi-task method, which combines supervised and self-supervised knowledge, provides competitive results with respect to more complex domain generalization and adaptation solutions. It also proves its potential in the novel and challenging predictive and partial domain adaptation scenarios.
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence ( Volume: 44, Issue: 9, 01 September 2022)
Page(s): 5516 - 5528
Date of Publication: 02 April 2021

PubMed ID: 33798074


1 Introduction

Many definitions of intelligence have been formulated by psychologists and learning researchers over the years. Despite their differences, they all identify the ability to adapt and achieve goals under a wide range of conditions as a key component [1]. Artificial intelligence inherits these definitions, with the most recent research demonstrating the importance of knowledge transfer and domain generalization [18]. Indeed, in many practical applications the underlying distributions of training (i.e., source) and test (i.e., target) data inevitably differ, calling for robust and adaptable solutions.

When dealing with visual domains, most current strategies are based on supervised learning. These approaches search for semantic spaces able to capture basic data knowledge regardless of the specific appearance of the input images: some decouple image style from the shared object content [7], others generate new samples [75] or impose adversarial conditions to reduce feature discrepancy [46], [48]. With the analogous aim of obtaining general-purpose feature embeddings, an alternative research direction is pursued by self-supervised learning, which captures visual invariances and regularities by solving tasks that do not need data annotation, such as image orientation recognition [30] or image colorization [84]. Unlabeled data are largely available and by their very nature less prone to bias (no labeling bias issue [72]), so they seem the perfect candidate to provide visual information independent of specific domain styles. However, their potential has not been fully exploited: existing self-supervised approaches often come with tailored architectures that need dedicated fine-tuning strategies to re-engineer the acquired knowledge [60]. Moreover, they are mainly applied to real-world photos, without considering cross-domain scenarios with images of paintings or sketches.
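The combination of a supervised classification objective with a self-supervised auxiliary signal such as orientation recognition can be sketched as a weighted multi-task loss. The following NumPy snippet is only an illustration of the general idea, not the paper's implementation: the rotation-labeling procedure, the weight `alpha`, and all function names are our own assumptions.

```python
import numpy as np

def make_rotation_task(images):
    """Self-supervised signal: rotate each image by a random multiple
    of 90 degrees; the rotation index is a free pseudo-label.
    (Illustrative choice, following image orientation recognition.)"""
    rotated, labels = [], []
    for img in images:
        k = np.random.randint(4)          # rotation index in {0, 1, 2, 3}
        rotated.append(np.rot90(img, k))  # rotate by k * 90 degrees
        labels.append(k)
    return np.stack(rotated), np.array(labels)

def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the correct class."""
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def multitask_loss(cls_probs, cls_labels, rot_probs, rot_labels, alpha=0.7):
    """Supervised classification loss plus a weighted self-supervised
    auxiliary loss; alpha is a hypothetical trade-off hyperparameter."""
    return (cross_entropy(cls_probs, cls_labels)
            + alpha * cross_entropy(rot_probs, rot_labels))
```

Because the auxiliary labels come for free from the transformation itself, the secondary head can be trained on images from any domain, labeled or not, which is what lets it act as a regularizer across domains.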
