1. Introduction
Deep neural networks have achieved unprecedented success in a myriad of vision tasks over the past decade. Despite the promise, a well-trained model deployed in the open and ever-changing world often struggles to deal with the domain shifts—the training and testing data do not follow the independent and identically distributed (i.i.d) assumption, and therefore deteriorates its safety and reliability in many safety-critical applications, such as autonomous driving and computer-aided disease diagnosis. This gives rise to the importance of Domain Generalization (DG) [101], [83], a.k.a. out-of-distribution (OOD) generalization, which aims at generalizing predictive models trained on multiple (or a single) source domains to unseen target distributions.