
Deep Reference Frame for Versatile Video Coding with Structural Re-parameterization


Abstract:

In video coding, inter-prediction leverages neighboring frames to reduce temporal redundancy, and the quality of these reference frames is essential for effective inter-prediction. Although many neural network-based methods have been proposed to improve the quality of reference frames, there is still room to improve the performance-efficiency trade-off. In this paper, we propose an interpolation diverse branch block (InterDBB) suited to lightweight frame interpolation networks, which optimizes deep reference frame interpolation networks to improve performance without sacrificing speed or increasing inference complexity. Specifically, we propose a multi-branch structural re-parameterization block without batch normalization; this straightforward yet effective modification ensures training stability while improving performance. Moreover, we propose a parameterized motion estimation strategy based on different input resolutions to achieve a better trade-off between performance and computational complexity. Experimental results demonstrate that our method achieves -2.01%/-2.87%/-2.44% coding-efficiency improvements for the Y/U/V components under the random access (RA) configuration compared to VTM-11.0_NNVC-5.0.
Date of Conference: 08-11 December 2024
Date Added to IEEE Xplore: 27 January 2025
Conference Location: Tokyo, Japan


I. INTRODUCTION

With the development of multimedia technology, there is a growing need for video storage and transmission. The Joint Video Experts Team (JVET) has introduced several video coding standards, with versatile video coding (VVC) being the latest [1]. A core part of VVC, inter-prediction, minimizes temporal redundancy by finding the best match for the current Coding Unit (CU) in reference frames, thereby reducing the bitrate; the reliability of these reference frames is therefore essential [2]. The rapid progress of deep learning has led an increasing number of researchers to integrate neural network-based tools into existing coding frameworks [3]–[11], with studies exploring its application to inter-prediction through bi-prediction [12], [13], fractional interpolation [14], [15], and reference frame interpolation [3], [16], [17]. Although these NN-based methods improve inter-prediction performance, their high computational complexity leads to longer coding times and higher memory usage, limiting their practical application. To reduce complexity, lightweight designs have been proposed, such as reducing input channels [18]–[20] and decreasing the number of layers [18]; however, these techniques inevitably incur performance loss. JVET emphasizes the need for low-complexity neural network-based video coding (NNVC) methods [21], and inter-prediction techniques are an important direction in this regard. In recent years, researchers have continued to propose new video frame synthesis methods that pursue higher performance at lower computational complexity. Jia et al. [3] developed a technique that produces interpolated frames more closely resembling the current frame to be encoded, yielding substantial bitrate savings. Meng et al. [22] proposed a deep reference frame interpolation network that significantly reduces computational complexity, expanding its practical applicability.
Considering the already low complexity of existing solutions, our aim is to enhance performance without adding any complexity to the current approach.
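The exact InterDBB design is detailed later in the paper, but the core idea of structural re-parameterization (as in RepVGG [24] and DBB [25], here without batch normalization) is that a training-time multi-branch block collapses into a single convolution at inference, so the extra branches cost nothing at decode time. A minimal numpy sketch, assuming a 3x3 branch plus a 1x1 branch with additive biases (the branch names and shapes are illustrative, not the paper's actual block):

```python
import numpy as np

def conv2d(x, w, b):
    """Naive 2D convolution. x: (C_in, H, W); w: (C_out, C_in, k, k);
    zero padding of k//2 keeps the output at H x W."""
    c_out, c_in, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    H, W = x.shape[1:]
    y = np.zeros((c_out, H, W))
    for o in range(c_out):
        for i in range(H):
            for j in range(W):
                y[o, i, j] = np.sum(w[o] * xp[:, i:i+k, j:j+k]) + b[o]
    return y

rng = np.random.default_rng(0)
c_in, c_out = 4, 4
# Training-time branches: a 3x3 conv and a parallel 1x1 conv.
w3, b3 = rng.standard_normal((c_out, c_in, 3, 3)), rng.standard_normal(c_out)
w1, b1 = rng.standard_normal((c_out, c_in, 1, 1)), rng.standard_normal(c_out)

# Fusion: embed the 1x1 kernel at the centre of a 3x3 kernel and sum it
# with the 3x3 branch; biases add. (An identity branch would likewise
# become a 3x3 kernel with 1 at the centre of each matching channel pair.)
w_fused = w3.copy()
w_fused[:, :, 1:2, 1:2] += w1
b_fused = b3 + b1

x = rng.standard_normal((c_in, 8, 8))
y_branches = conv2d(x, w3, b3) + conv2d(x, w1, b1)
y_fused = conv2d(x, w_fused, b_fused)
assert np.allclose(y_branches, y_fused)
```

Because the fusion is exact (convolution is linear in the kernel), the inference network is a plain single-branch model with the training-time capacity folded in; omitting batch normalization, as the paper proposes, avoids the BN-folding step and its training instabilities in small interpolation networks.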

REFERENCES
1.
B. Bross, J. Chen, J. Ohm, G. J. Sullivan and Y. Wang, "Developments in international video coding standardization after AVC with an overview of versatile video coding (VVC)", Proc. IEEE, vol. 109, no. 9, pp. 1463-1493, 2021.
2.
W. Chien, L. Zhang, M. Winken, X. Li, R. Liao, H. Gao, et al., "Motion vector coding and block merging in the versatile video coding standard", IEEE Trans. Circuits Syst. Video Technol, vol. 31, no. 10, pp. 3848-3861, 2021.
3.
J. Jia, D. Ding, W. Meng, Z. Chen, Z. Liu, X. Xu, et al., "EE1-2.1-related: DRF model without QP input", JVET meeting, 2023.
4.
X. Cheng and Z. Chen, "Video frame interpolation via deformable separable convolution", AAAI Conference on Artificial Intelligence, pp. 10607-10614, 2020.
5.
H. Choi and I. V. Bajic, "Deep frame prediction for video coding", IEEE Trans. Circuits Syst. Video Technol, vol. 30, no. 7, pp. 1843-1855, 2020.
6.
J. Dong, K. Ota and M. Dong, "Video frame interpolation: A comprehensive survey", ACM Trans. Multim. Comput. Commun. Appl, vol. 19, no. 2s, pp. 78:1-78:31, 2023.
7.
C. Wu, N. Singhal and P. Krähenbühl, "Video compression through image interpolation", European Conference on Computer Vision, pp. 425-440, 2018.
8.
R. Chang, L. Wang, X. Xu and S. Liu, "EE1-1.5: Optimization for complexity-performance trade-off of HOP network", JVET meeting, 2023.
9.
D. Rusanovskyy, Y. Li and M. Karczewicz, "EE1-1.2 complexity-performance tradeoff of decomposition", JVET meeting, 2023.
10.
Y. Li, D. Rusanovskyy and M. Karczewicz, "EE1-4.4: Low complexity NN filter with design elements of unified filter architecture and EE1-1.2 and EE1-1.3", JVET meeting, 2023.
11.
F. Galpin, S. Eadie, D. Rusanovskyy, Y. Li, J. Li, L. Wang, et al., "AHG11: EE1-0 high operation point model", JVET meeting, 2023.
12.
T. Zhao, W. Feng, H. Zeng, Y. Xu, Y. Niu and J. Liu, "Learning-based video coding with joint deep compression and enhancement", ACM International Conference on Multimedia, pp. 3045-3054, 2022.
13.
Z. Zhao, S. Wang, S. Wang, X. Zhang, S. Ma and J. Yang, "Enhanced bi-prediction with convolutional neural network for high-efficiency video coding", IEEE Trans. Circuits Syst. Video Technol, vol. 29, no. 11, pp. 3291-3301, 2019.
14.
H. Azgin, E. Kalali and I. Hamzaoglu, "An approximate versatile video coding fractional interpolation hardware", IEEE International Conference on Consumer Electronics, pp. 1-4, 2020.
15.
C. D. Pham and J. Zhou, "Deep learning-based luma and chroma fractional interpolation in video coding", IEEE Access, vol. 7, pp. 112535-112543, 2019.
16.
L. Zhao, S. Wang, X. Zhang, S. Wang, S. Ma and W. Gao, "Enhanced motion-compensated video coding with deep virtual reference frame generation", IEEE Trans. Image Process, vol. 28, no. 10, pp. 4832-4844, 2019.
17.
W. Bao, W. Meng, J. Jia, Y. Zhang, H. Wang, Z. Chen, et al., "EE1-5.1: Deep reference frame generation for inter prediction enhancement", JVET meeting, 2023.
18.
A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, et al., "MobileNets: Efficient convolutional neural networks for mobile vision applications", 2017.
19.
M. Sandler, A. G. Howard, M. Zhu, A. Zhmoginov and L. Chen, "MobileNetV2: Inverted residuals and linear bottlenecks", IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510-4520, 2018.
20.
A. Howard, R. Pang, H. Adam, Q. V. Le, M. Sandler, B. Chen, et al., "Searching for MobileNetV3", IEEE International Conference on Computer Vision, pp. 1314-1324, 2019.
21.
Z. Wang and F. Li, "Convolutional neural network based low complexity HEVC intra encoder", Multim. Tools Appl, vol. 80, no. 2, pp. 2441-2460, 2021.
22.
W. Meng, Y. Zhang, J. Jia, S. Chao and Z. Chen, "Towards lightweight deep reference frame for versatile video coding", IEEE International Conference on Visual Communications and Image Processing, pp. 1-5, 2023.
23.
L. Kong, B. Jiang, D. Luo, W. Chu, X. Huang, Y. Tai, et al., "IFRNet: Intermediate feature refine network for efficient frame interpolation", IEEE Conference on Computer Vision and Pattern Recognition, pp. 1959-1968, 2022.
24.
X. Ding, X. Zhang, N. Ma, J. Han, G. Ding and J. Sun, "RepVGG: Making VGG-style ConvNets great again", IEEE Conference on Computer Vision and Pattern Recognition, pp. 13733-13742, 2021.
25.
X. Ding, X. Zhang, J. Han and G. Ding, "Diverse branch block: Building a convolution as an inception-like unit", IEEE Conference on Computer Vision and Pattern Recognition, pp. 10886-10895, 2021.
26.
J. Jia, Y. Zhang, H. Zhu, Z. Chen, Z. Liu, X. Xu, et al., "Deep reference frame generation method for VVC inter prediction enhancement", IEEE Trans. Circuits Syst. Video Technol, vol. 34, no. 5, pp. 3111-3124, 2024.
27.
T. Xue, B. Chen, J. Wu, D. Wei and W. T. Freeman, "Video enhancement with task-oriented flow", Int. J. Comput. Vis, vol. 127, no. 8, pp. 1106-1125, 2019.
28.
D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization", International Conference on Learning Representations, 2015.
29.
E. Alshina, R.-L. Liao, S. Liu and A. Segall, "JVET common test conditions and evaluation procedures for neural network-based video coding technology", JVET meeting, 2023.
30.
G. Zhang, C. Liu, Y. Cui, X. Zhao, K. Ma and L. Wang, "VFIMamba: Video frame interpolation with state space models", 2024.