1. Introduction
Remote sensing (RS) imagery is a valuable data source for Earth observation, used to measure a range of physical properties on Earth. Compared to natural images, RS imagery spans multiple wavelengths and therefore carries more information. Understanding massive and complex multispectral RS images is thus both critical and more challenging. Several factors make satellite image processing harder than natural image processing: (1) large intraclass diversity, (2) high interclass similarity, (3) large variance in object/scene scales, and (4) the coexistence of multiple ground objects [1]. Recently, many RS tasks have been carried out using optical multispectral images from Sentinel-2, an approach that performs very well when cloud coverage is minimal. Under cloudy conditions, however, optical imagery is not viable, since cloud removal is itself a challenging task [2].