1. Introduction
Video super-resolution (VSR) aims to recover a high-resolution (HR) video from a low-resolution (LR) counterpart [39]. As a fundamental task in computer vision, VSR is usually adopted to enhance visual quality, which has great value in many practical applications, such as video surveillance [48], high-definition television [10], and satellite imagery [6], [27]. From a methodology perspective, unlike image super-resolution, which usually learns on spatial dimensions, VSR pays more attention to exploiting temporal information. As shown in Fig. 1, if detailed textures for recovering the target frame can be discovered and leveraged from relatively distant frames, video quality can be greatly enhanced.
Figure 1. A comparison between TTVSR and other SOTA methods: MuCAN [24] and IconVSR [4]. We introduce finer textures for recovering the target frame from the boxed areas (indicated in yellow) tracked by the trajectory (indicated in green).