I. Introduction
Deep learning has played a major role in remote sensing image (RSI) interpretation [1], [2], [3], [4], [5], [6], [7], [8]. Many convolutional neural network (CNN)-based models, such as ResNet-50 [9], are typically pretrained on ImageNet [10]. However, the inherent domain gap between natural images and RSIs, in both data characteristics and imaging mechanisms, limits the performance of these models. Although some works [11], [12], [13], [14] have explored pretraining specifically on RSIs, they require diverse sources of RSIs, and the RSI community still lacks a commonly accepted pretraining benchmark comparable to ImageNet. Moreover, pretraining on a large-scale dataset requires significant computing resources, which makes it feasible at only a limited number of research institutions. This article aims to generate competitive pretrained models from small-scale, randomly sampled RSIs with limited computing resources, e.g., four NVIDIA V100 GPUs.