I. Introduction
Compared with traditional static remote sensing images, satellite video provides continuous information about a specific area, which is crucial for dynamic Earth observation. It has therefore been widely applied to dynamic-scene tasks such as change detection [1], object tracking [2], and traffic monitoring [3]. However, the spatial resolution of satellite video is usually impaired by the complex aerial environment, limited by the intrinsic resolution of satellite video sensors (~1 m), and further degraded by data compression. Consequently, high-frequency information in the satellite video may be lost, which dramatically reduces visual quality and degrades performance in subsequent applications. It is therefore essential to improve the spatial resolution of satellite videos.