MMVC: Learned Multi-Mode Video Compression with Block-based Prediction Mode Selection and Density-Adaptive Entropy Coding | IEEE Conference Publication | IEEE Xplore

Abstract:

Learning-based video compression has been studied extensively in recent years, but it still has limitations in adapting to varied motion patterns and entropy models. In this paper, we propose multi-mode video compression (MMVC), a block-wise mode-ensemble deep video compression framework that selects the optimal mode for feature-domain prediction, adapting to different motion patterns. The proposed modes include ConvLSTM-based feature-domain prediction, optical-flow-conditioned feature-domain prediction, and feature propagation, which together address a wide range of cases, from static scenes without apparent motion to dynamic scenes with a moving camera. We partition the feature space into blocks for temporal prediction in spatial block-based representations. For entropy coding, we consider both dense and sparse post-quantization residual blocks, and apply optional run-length coding to sparse residuals to improve the compression rate. In this sense, our method uses a dual-mode entropy coding scheme guided by a binary density map, which offers a significant rate reduction that outweighs the extra cost of transmitting the binary selection map. We validate our scheme on several popular benchmark datasets. Compared with state-of-the-art video compression schemes and standard codecs, our method yields better or competitive results in terms of PSNR and MS-SSIM.
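The dual-mode entropy coding idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the density threshold of 0.25, and the choice to store dense blocks raw (standing in for whatever entropy coder the paper applies to dense residuals) are all assumptions made for illustration. The sketch only shows how a binary density map can route each quantized residual block to either a run-length path or a dense path, and how the decoder uses the same map to invert the choice.

```python
from typing import List, Tuple, Union

Block = List[int]                      # flattened, quantized residual block
Runs = List[Tuple[int, int]]           # (value, run_length) pairs


def run_length_encode(block: Block) -> Runs:
    """Collapse consecutive equal values into (value, run_length) pairs."""
    runs: Runs = []
    for v in block:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def run_length_decode(runs: Runs) -> Block:
    """Expand (value, run_length) pairs back into a flat block."""
    out: Block = []
    for v, n in runs:
        out.extend([v] * n)
    return out


def density(block: Block) -> float:
    """Fraction of nonzero entries; an empty block counts as density 0."""
    if not block:
        return 0.0
    return sum(1 for v in block if v != 0) / len(block)


def encode_blocks(blocks: List[Block], threshold: float = 0.25):
    """Return (density_map, payloads).

    density_map[i] is 1 when block i is treated as dense and 0 when sparse.
    The threshold is a hypothetical value, not one taken from the paper.
    """
    density_map: List[int] = []
    payloads: List[Union[Block, Runs]] = []
    for b in blocks:
        if density(b) > threshold:
            density_map.append(1)
            payloads.append(list(b))               # placeholder for the dense path
        else:
            density_map.append(0)
            payloads.append(run_length_encode(b))  # sparse path
    return density_map, payloads


def decode_blocks(density_map: List[int], payloads) -> List[Block]:
    """Invert encode_blocks using the transmitted density map."""
    return [
        list(p) if flag else run_length_decode(p)
        for flag, p in zip(density_map, payloads)
    ]
```

In this sketch the density map costs one bit per block, which is the overhead the abstract says is outweighed by the rate saved on sparse blocks.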
Date of Conference: 17-24 June 2023
Date Added to IEEE Xplore: 22 August 2023
Conference Location: Vancouver, BC, Canada

1. Introduction

Over the past several years, with the rise of short videos and video conferencing across the world, video has become the major carrier of information and everyday interaction among people. Consequently, demand for transmission bandwidth and storage space has increased sharply, accompanied by the vigorous development of handcrafted codecs such as AVC/H.264 [23], HEVC [23], and the recently released VVC [22], along with a number of learning-based methods [1], [7], [9], [11], [12], [15]–[17], [21], [27], [30], [31].
