1. Introduction
Video frame interpolation (VFI) is an active topic in computer vision, with applications that include slow-motion generation [1], [16], [46] and video compression [44]. It synthesizes intermediate frames that are absent from the original video sequence, enabling effects such as smoother playback and high-frame-rate conversion. Most existing VFI methods are motion-based: they estimate pixel-level motion between keyframes and warp those keyframes to obtain the interpolated frames. However, these methods are limited under occlusion and non-linear motion, where they often fail to predict motion accurately in complex scenes, which degrades the quality of the interpolated frames.
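To make the motion-based strategy concrete, the following is a minimal sketch of flow-based warping under a linear-motion assumption. It is not the method of any cited work. The function names `backward_warp` and `interpolate_linear`, the grayscale `(H, W)` frame layout, and the availability of a precomputed flow `flow01` from the first keyframe to the second are all assumptions made for illustration.

```python
import numpy as np

def backward_warp(frame, flow):
    """Bilinearly sample `frame` at (x + u, y + v) for every output pixel.

    frame: (H, W) float array (hypothetical grayscale layout).
    flow:  (H, W, 2) array, flow[..., 0] = u (x offset), flow[..., 1] = v (y offset).
    """
    h, w = frame.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Clamp sampling positions to the image border.
    x = np.clip(xs + flow[..., 0], 0, w - 1)
    y = np.clip(ys + flow[..., 1], 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = x - x0
    wy = y - y0
    top = (1 - wx) * frame[y0, x0] + wx * frame[y0, x1]
    bottom = (1 - wx) * frame[y1, x0] + wx * frame[y1, x1]
    return (1 - wy) * top + wy * bottom

def interpolate_linear(frame0, frame1, flow01, t=0.5):
    """Synthesize the frame at time t in (0, 1) from two keyframes.

    Assumes linear motion: the flows from time t back to each keyframe are
    approximated as scaled copies of flow01. No occlusion reasoning is done.
    """
    flow_t0 = -t * flow01          # approximate flow from time t to frame0
    flow_t1 = (1.0 - t) * flow01   # approximate flow from time t to frame1
    warped0 = backward_warp(frame0, flow_t0)
    warped1 = backward_warp(frame1, flow_t1)
    # Temporal blend; pixels visible in only one keyframe are still averaged.
    return (1.0 - t) * warped0 + t * warped1
```

The sketch exposes both limitations noted above. The scaled-flow approximation holds only when motion between the keyframes is linear, and the final blend averages the two warped keyframes even where a region is occluded in one of them.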
Qualitative comparison of occlusion handling. Methods (a) and (b) estimate occlusion maps from frames, (c) uses events for occlusion judgments, and (d) makes occlusion judgments for optical flow at the feature level. Our method (e) achieves the best results by providing a direct structural reference instead of estimating an occlusion map.