I. Introduction
Recently, vision foundation models (VFMs) have emerged and attracted great research interest in the field of computer vision. Leveraging the knowledge learned from massive datasets, foundation models such as the Segment Anything Model (SAM) [1] and its variants (e.g., FastSAM [2] and MobileSAM [3]) are able to recognize visual content in a training-free manner and produce fine-grained semantic masks. Their strong generalization across different imaging conditions and visual objects greatly promotes their application in real-world scenarios. However, due to the inductive bias learned from natural images, foundation models exhibit limitations when applied to images in certain specific domains, such as medical images and remote sensing images (RSIs). According to the literature survey [4], SAM pays more attention to foreground objects and often fails to segment small and irregular objects. In this article, we focus on adapting SAM to improve one of the fundamental tasks in remote sensing (RS), i.e., change detection (CD) in very high-resolution (VHR) RSIs.