
Deep Reference Frame for Versatile Video Coding with Structural Re-parameterization



Abstract:

In video coding, inter-prediction leverages neighboring frames to reduce temporal redundancy, and the quality of the reference frames is essential for effective inter-prediction. Although many neural network-based methods have been proposed to improve reference frame quality, there is still room to improve the performance-efficiency trade-off. In this paper, we propose an interpolation diverse branch block (InterDBB) suitable for lightweight frame interpolation networks, which optimizes deep reference frame interpolation networks to improve performance without sacrificing speed or increasing complexity. Specifically, we propose a multi-branch structural re-parameterization block without batch normalization. This straightforward yet effective modification ensures training stability and improves performance. Moreover, we propose a parameterized motion estimation strategy based on different input resolutions to achieve a better trade-off between performance and computational complexity. Experimental results demonstrate that our method achieves -2.01%/-2.87%/-2.44% coding efficiency improvements for the Y/U/V components under the random access (RA) configuration compared to VTM-11.0_NNVC-5.0.
Date of Conference: 08-11 December 2024
Date Added to IEEE Xplore: 27 January 2025
Conference Location: Tokyo, Japan


I. INTRODUCTION

With the development of multimedia technology, there is a growing need for video storage and transmission. The Joint Video Experts Team (JVET) has introduced several video coding standards, with versatile video coding (VVC) being the latest [1]. A core part of VVC, inter-prediction, minimizes temporal redundancy by finding the best match for the current Coding Unit (CU) in reference frames, thereby reducing bitrate and improving reference frame reliability [2].

The rapid progress of deep learning has led an increasing number of researchers to integrate neural network-based tools into existing coding frameworks [3]–[11], with studies exploring its application to inter-prediction through bi-prediction [12], [13], fractional interpolation [14], [15], and reference frame interpolation [3], [16], [17]. Although these NN-based methods improve inter-prediction performance, their high computational complexity results in longer coding times and higher memory usage, limiting their practical application. To reduce complexity, some lightweight models have been proposed, for example by reducing input channels [18]–[20] or decreasing the number of layers [18]; however, these techniques inevitably cause performance loss. JVET emphasizes the need for low-complexity NNVC methods [21], and inter-prediction techniques are an important direction in this regard.

In recent years, researchers have continued to propose new video frame synthesis methods that aim for higher performance and lower computational complexity. Jia et al. [3] developed a technique that produces interpolated frames with a closer resemblance to the current encoding frame, yielding substantial bitrate savings. Meng et al. [22] proposed a deep reference frame interpolation network that significantly reduces computational complexity, expanding its practical applicability. Given the already low complexity of existing solutions, our aim is to enhance performance without adding any complexity to the current approach.
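As a concrete illustration of the BN-free structural re-parameterization named in the title, the sketch below (in PyTorch, not the authors' InterDBB code) fuses a parallel 3x3 + 1x1 convolution pair without batch normalization into a single 3x3 convolution at inference time; the specific branch combination, channel sizes, and helper name fuse_branches are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

# Minimal sketch (assumed, not the paper's code): with no batch norm
# between the branches and their sum, the parallel convolutions are
# purely linear, so their kernels and biases can be added into one kernel.

def fuse_branches(w3x3, b3x3, w1x1, b1x1):
    """Zero-pad the 1x1 kernel to 3x3 and sum the two branches."""
    w1x1_padded = F.pad(w1x1, [1, 1, 1, 1])   # (out, in, 1, 1) -> (out, in, 3, 3)
    return w3x3 + w1x1_padded, b3x3 + b1x1

# Sanity check: the multi-branch output equals the fused single-conv output.
x = torch.randn(1, 8, 32, 32)
w3, b3 = torch.randn(16, 8, 3, 3), torch.randn(16)
w1, b1 = torch.randn(16, 8, 1, 1), torch.randn(16)

y_branches = F.conv2d(x, w3, b3, padding=1) + F.conv2d(x, w1, b1, padding=0)
w_f, b_f = fuse_branches(w3, b3, w1, b1)
y_fused = F.conv2d(x, w_f, b_f, padding=1)

assert torch.allclose(y_branches, y_fused, atol=1e-5)
```

Because the fusion is exact, the multi-branch structure only adds capacity during training; the deployed network runs a single convolution per block, consistent with the goal of improving performance without increasing inference complexity.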

