I. Introduction
Deep neural networks (DNNs) achieve state-of-the-art performance in many tasks [1]–[5]. Since DNNs are driven by data, biased data inevitably lead to biased models, resulting in poor generalization to test domains [6]. To confront the bias, unbiased training has been proposed to directly compensate for the bias effect, e.g., jittering or flipping images for data augmentation [7], [8], batch normalization for stable mean and variance [9], neuron dropout for robust features [10], and re-weighting for balanced sample loss [11]–[13], just to name a few. Meanwhile, some types of bias, such as visual context, are essentially good for various tasks [14], [15], e.g., an image containing a highway greatly increases the probability of a car or truck and decreases that of a lion or fish. In fact, there is evidence showing that removing such “good” bias indeed hurts model performance [16], as the “good” bias is likely to also appear in test cases. However, how to distinguish the “good” bias from the “bad” at the training stage remains an open problem.
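As a concrete illustration of the re-weighting idea mentioned above, the following sketch shows inverse-frequency loss re-weighting in PyTorch; the function name, the toy class counts, and the normalization scheme are our own illustrative assumptions rather than the specific methods of [11]–[13].

```python
import torch
import torch.nn.functional as F

def reweighted_cross_entropy(logits, targets, class_counts):
    """Cross-entropy with inverse-frequency class weights.

    Rare classes receive larger weights so the loss is not dominated
    by the majority (biased) classes.
    """
    # Inverse-frequency weights, rescaled to keep the loss magnitude stable.
    weights = 1.0 / class_counts.float()
    weights = weights * (len(class_counts) / weights.sum())
    return F.cross_entropy(logits, targets, weight=weights)

# Toy usage: 3 classes with a heavy imbalance (900 : 90 : 10 samples).
logits = torch.randn(8, 3)
targets = torch.randint(0, 3, (8,))
class_counts = torch.tensor([900, 90, 10])
loss = reweighted_cross_entropy(logits, targets, class_counts)
print(loss.item())
```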