1. Introduction
Despite the remarkable success of deep neural networks (DNNs), many areas of computer vision still suffer from highly imbalanced datasets. Real-world data often exhibit skewed distributions [23], [16], [11], [24], [10], in which the number of samples per class differs greatly. This class imbalance is problematic because a model trained on such data tends to overfit the dominant (majority) classes [18], [14], [4]. That is, while overall performance appears satisfactory, the model performs poorly on minority classes. To overcome the class imbalance problem, extensive research has recently sought to improve generalization by reducing the overwhelming influence of the dominant classes on the model.