I. Introduction
Over the last decade, Deep Learning (DL) has achieved tremendous success in fields such as agriculture [1], medical imaging [2], and robotics [3]. Despite this success, traditional DL models underperform when faced with domain shift, i.e., when the training and test sets are no longer identically distributed, a situation often encountered in the real world [4]. To address this scenario, numerous methods have been developed in recent years under the framework of Domain Adaptation (DA) [5]. The main idea behind DA is to adapt a model trained on a labeled source domain so as to minimize the generalization error on an unlabeled target domain [6]. While conventional DA models typically assume a single source and a single target domain, in many real-world applications data are collected from multiple related domains, such as images taken under different conditions (e.g., lighting, pose) [7]. This has motivated the more practical and challenging Multi-Source Domain Adaptation (MDA) framework, which aims to transfer knowledge from multiple labeled source domains to an unlabeled target domain.