Enhancing Video Super-Resolution via Implicit Resampling-based Alignment

Abstract:

In video super-resolution, it is common to use a frame-wise alignment to support the propagation of information over time. The role of alignment is well-studied for low-level enhancement in video, but existing works overlook a critical step: resampling. We show through extensive experiments that for alignment to be effective, the resampling should preserve the reference frequency spectrum while minimizing spatial distortions. However, most existing works simply use a default choice of bilinear interpolation for resampling even though bilinear interpolation has a smoothing effect and hinders super-resolution. From these observations, we propose an implicit resampling-based alignment. The sampling positions are encoded by a sinusoidal positional encoding, while the value is estimated with a coordinate network and a window-based cross-attention. We show that bilinear interpolation inherently attenuates high-frequency information while an MLP-based coordinate network can approximate more frequencies. Experiments on synthetic and real-world datasets show that alignment with our proposed implicit resampling enhances the performance of state-of-the-art frameworks with minimal impact on both compute and parameters.
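
The smoothing effect of bilinear interpolation can be illustrated with a minimal 1-D sketch. This is not the experiment from the paper: it assumes a periodic 1-D signal, a fixed half-pixel shift, and linear interpolation as the 1-D analogue of bilinear resampling. The helper names (`linear_resample`, `amplitude`, `attenuation`) are hypothetical. For a cosine of angular frequency ω, a half-pixel linear resampling scales the amplitude by |cos(ω/2)|, so high frequencies are attenuated far more than low ones.

```python
import math


def linear_resample(signal, shift):
    """Sample a periodic 1-D signal at positions i + shift with linear
    interpolation (the 1-D analogue of bilinear interpolation)."""
    n = len(signal)
    base = math.floor(shift)
    t = shift - base  # fractional offset in [0, 1)
    out = []
    for i in range(n):
        left = signal[(i + base) % n]
        right = signal[(i + base + 1) % n]
        out.append((1.0 - t) * left + t * right)
    return out


def amplitude(signal, cycles):
    """Amplitude of the component with `cycles` periods over the signal,
    taken from a single DFT bin (a unit cosine yields 1.0)."""
    n = len(signal)
    re = sum(v * math.cos(2 * math.pi * cycles * i / n) for i, v in enumerate(signal))
    im = sum(v * math.sin(2 * math.pi * cycles * i / n) for i, v in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n


def attenuation(cycles, n=64, shift=0.5):
    """Amplitude that survives when a unit cosine is resampled at a
    sub-pixel shift with linear interpolation."""
    wave = [math.cos(2 * math.pi * cycles * i / n) for i in range(n)]
    return amplitude(linear_resample(wave, shift), cycles)


low = attenuation(cycles=2)    # low frequency:  about 0.995 of the amplitude survives
high = attenuation(cycles=24)  # high frequency: about 0.383 of the amplitude survives
print(f"low-frequency gain:  {low:.3f}")
print(f"high-frequency gain: {high:.3f}")
```

In this toy setting the low-frequency cosine passes almost unchanged while the high-frequency one loses more than 60% of its amplitude, which is the kind of detail super-resolution needs to keep.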

Published in: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Date of Conference: 16-22 June 2024

Date Added to IEEE Xplore: 16 September 2024

DOI: 10.1109/CVPR52733.2024.00246

Conference Location: Seattle, WA, USA

1. Introduction

Video super-resolution (VSR) recovers a high spatial resolution sequence of frames from a low-resolution sequence. While image super-resolution can be applied naively to each frame individually, the temporal correlations across the frames give an extra source of information to improve the super-resolved output. As such, the main difference in video versus image super-resolution architectures lies in the use of temporal dependencies. Previous works [2], [9], [26], [28] have shown that spatial alignment is an essential pre-processing step for effective information exchange across the frames. Given the frame-to-frame camera and object motions, alignment provides indications of sub-pixel information which can benefit the super-resolution.
