1. Introduction
Video frame interpolation, the synthesis of intermediate frames between existing frames of a video, is an important technique with applications in frame-rate conversion [33], video editing [31], novel view interpolation [21], video compression [59], and motion blur synthesis [5]. While the performance of video frame interpolation approaches has improved steadily, the approaches themselves have become increasingly complex. For example, DAIN [3] combines optical flow estimation [51], single image depth estimation [26], context-aware image synthesis [35], and adaptive convolutions [37]. However, we show that, somewhat surprisingly, it is possible to achieve near state-of-the-art results with an older, simpler approach by carefully optimizing its individual parts. Specifically, we revisit the idea of using adaptive separable convolutions [38] and augment it with a set of intuitive improvements. The resulting optimized method, SepConv++, ranks second among all published methods on the Middlebury benchmark [1].