1. Introduction
The development of object recognition is carried by two pillars: large-scale data annotation and deep neural networks. With new applications coming out every day, re-searchers need to constantly develop new methods and create new datasets. While we are able to develop novel neural networks for new tasks, the creation of new datasets can hardly keep up due to its huge cost. In the literature, a diverse set of learning paradigms, such as self-learning [13], semi-supervised learning [17] and transfer learning [6], have been developed to come to the rescue. We enrich this repository by developing a method to combine multiple existing datasets that have been annotated in different domains, for smaller-scale tasks (fewer classes), and/or with fewer data modalities. The importance of the method can be justified by the fact that as time goes, research goals will become more and more ambitious, so object recognition models for more classes, new domains, and/or more data modalities are necessary.