I. Introduction
Image inpainting refers to the process of reconstructing damaged regions of an image in a visually plausible manner using the known information in the remaining regions. Driven by the rapid development of satellite technology, remote sensing (RS) image processing has attracted extensive attention from both academia and industry. Although many techniques have been developed for missing data reconstruction in RS imagery, the task remains difficult owing to the complex reconstruction scenarios, as shown in Fig. 1. The missing regions obscure land surface features and hamper subsequent image processing tasks such as classification, detection, and segmentation [1]–[4]. Hence, recovering the missing information in RS imagery remains an urgent task.
Most existing missing data reconstruction methods for RS images [5]–[7] exploit the high correlations or regular fluctuations between different sources of data to repair the image. These approaches are mainly designed as single-task solutions with specialized architectures or loss functions, which limits their generalization ability across diverse scenes. Moreover, the above models require other forms of data as supplementary input (e.g., spatial, temporal, or spectral images), which may be unavailable under certain circumstances. Very recently, a few universal reconstruction methods [8], [9] have been presented. However, these approaches are also data-intensive: the additional images are not always acquirable, and fusing multisource inputs requires careful design. Furthermore, these methods reconstruct the image by extracting and fusing features from each source separately, and they usually fail to learn reasonable feature representations when large regions are missing. In summary, contemporary state-of-the-art (SOTA) reconstruction methods either handle only a limited set of missing-data scenarios or require multisource information as supplementary input.
Fig. 1. Exemplar reconstruction results of our method on versatile scenarios in RS images. From left to right: the corrupted input image, the extracted mask, and the reasonable output of our model without any postprocessing.
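To make the problem setting concrete, the following minimal sketch (our own illustration, not code from any of the cited methods) shows how a corrupted RS patch is typically formed from a clean patch and a binary mask, and how reconstruction quality is usually assessed only over the missing region. The function and variable names (`corrupt_image`, `patch`, `mask`) are hypothetical.

```python
import numpy as np

def corrupt_image(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Simulate missing data by zeroing out pixels where mask == 0.

    image: (H, W, C) array of band values.
    mask:  (H, W) binary array, 1 = observed pixel, 0 = missing pixel.
    """
    return image * mask[..., None]

# Toy example: a 4-band RS patch with a square region missing (e.g., cloud cover).
rng = np.random.default_rng(0)
patch = rng.random((64, 64, 4)).astype(np.float32)
mask = np.ones((64, 64), dtype=np.float32)
mask[16:48, 16:48] = 0.0            # simulated missing region
corrupted = corrupt_image(patch, mask)

# A reconstruction model f takes (corrupted, mask) and predicts the full patch;
# the error is typically measured only on the missing region, e.g.
#   || (1 - mask) * (f(corrupted, mask) - patch) ||
```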