I. Introduction
Training data (TD) play a vital role in data-driven approaches to remote sensing image interpretation based on artificial intelligence (AI), machine learning (ML), and deep learning (DL). As a result, Earth observation (EO) experts, research teams, and organizations have made substantial efforts to generate massive training datasets for various image interpretation tasks [1–6], including scene classification, object detection, semantic segmentation, change detection, and 3D model reconstruction. These publicly available training datasets are valuable assets for AI research and applications in the EO domain. Although many datasets are openly released to the public, they differ in how the data are represented and organized, which leads to heterogeneity and makes dataset sharing and interoperability difficult [7–8].