1. Introduction
Self-supervised learning is an important research area whose goal is to learn high-quality data representations without any labelled supervision. Recently, self-supervised contrastive learning has shown promising results in image classification tasks [21], [7], [4]. In the contrastive learning paradigm, a model is trained to recognize different augmentations of the same image (commonly referred to as positives) while discriminating them from other random images (referred to as negatives) in the dataset. The promising performance of self-supervised contrastive learning led to the idea of leveraging label information in the contrastive learning paradigm. To this end, Khosla et al. [30] proposed a supervised contrastive learning framework that achieves better ImageNet accuracy than the standard cross-entropy model.
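For concreteness, the self-supervised contrastive objective described above is commonly instantiated as an InfoNCE-style loss; the notation below is a standard formulation rather than one taken from this paper. Given a batch of augmented samples with normalized embeddings $z$, where $j(i)$ indexes the other augmentation (positive) of sample $i$, the loss is

```latex
\mathcal{L}_{\text{self}}
  = -\sum_{i \in I}
    \log
    \frac{\exp\!\left(z_i \cdot z_{j(i)} / \tau\right)}
         {\sum_{a \in A(i)} \exp\!\left(z_i \cdot z_a / \tau\right)},
```

where $A(i)$ denotes all indices in the batch other than $i$ (the negatives plus the positive) and $\tau$ is a temperature hyperparameter. The supervised variant of Khosla et al. [30] generalizes the numerator to sum over all samples sharing the label of $i$, rather than only its augmentation.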