Toward Extreme Image Compression With Latent Feature Guidance and Diffusion Prior | IEEE Journals & Magazine | IEEE Xplore

Abstract:

Image compression at extremely low bitrates (below 0.1 bits per pixel (bpp)) is a significant challenge due to substantial information loss. In this work, we propose a novel two-stage extreme image compression framework that exploits the powerful generative capability of pre-trained diffusion models to achieve realistic image reconstruction at extremely low bitrates. In the first stage, we treat the latent representation of images in the diffusion space as guidance, employing a VAE-based compression approach to compress images and initially decode the compressed information into content variables. The second stage leverages pre-trained Stable Diffusion to reconstruct images under the guidance of content variables. Specifically, we introduce a small control module to inject content information while keeping the Stable Diffusion model fixed to maintain its generative capability. Furthermore, we design a space alignment loss to force the content variables to align with the diffusion space and provide the necessary constraints for optimization. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art approaches in terms of visual performance at extremely low bitrates. The source code and trained models are available at https://github.com/huai-chang/DiffEIC.
Page(s): 888 - 899
Date of Publication: 06 September 2024
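
The abstract states that a space alignment loss pushes the content variables toward the diffusion space, but it does not give the formula. The following is only a minimal sketch under the assumption that the alignment is measured as a mean squared error between the content variables and the diffusion-space latent of the same image. The function name `space_alignment_loss` and the flat-list representation are hypothetical and are not taken from the DiffEIC code.

```python
def space_alignment_loss(content, latent):
    """Mean squared error between two equally sized flat vectors.

    Assumption: 'alignment' is modelled as MSE. The paper's actual loss
    may differ; this only illustrates the idea of pulling the content
    variables toward the diffusion-space latent.
    """
    if len(content) != len(latent):
        raise ValueError("content and latent must have the same length")
    if not content:
        raise ValueError("inputs must be non-empty")
    squared_errors = ((c - z) ** 2 for c, z in zip(content, latent))
    return sum(squared_errors) / len(content)


# Hypothetical toy values standing in for flattened feature maps.
content_vars = [0.0, 0.0]
diffusion_latent = [1.0, 3.0]
loss = space_alignment_loss(content_vars, diffusion_latent)
```

A loss of zero would mean the content variables already coincide with the latent; larger values indicate that the first-stage output has drifted away from the space the frozen diffusion model expects.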

I. Introduction

Extreme image compression, which aims to compress images at bitrates below 0.1 bits per pixel (bpp), is critical in severely bandwidth-constrained scenarios, such as satellite communications. Traditional compression standards, such as JPEG2000 [1], BPG [2], and VVC [3], are widely used in practice. However, because of their block-based processing, these algorithms produce severe blocking artifacts at extremely low bitrates, as shown in Fig. 1(b).
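
To make the 0.1 bpp threshold concrete, the sketch below converts between a compressed file size and bits per pixel. The helper names `bits_per_pixel` and `byte_budget` are illustrative and do not come from the paper. The 768x512 resolution is that of a Kodak test image.

```python
def bits_per_pixel(num_bytes: int, width: int, height: int) -> float:
    """Bitrate of a compressed image: total bits divided by pixel count."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return 8 * num_bytes / (width * height)


def byte_budget(bpp: float, width: int, height: int) -> int:
    """Largest whole number of bytes that stays within a given bpp."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return int(bpp * width * height // 8)


# A 768x512 image has 393,216 pixels, so a 0.1 bpp budget allows
# about 39,321 bits, i.e. 4915 whole bytes for the entire image.
budget = byte_budget(0.1, 768, 512)
rate = bits_per_pixel(budget, 768, 512)
```

In other words, an extreme-compression codec must describe a full-colour 768x512 image in under 5 kB, which illustrates why so much information is lost at these rates.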

Visual examples of the reconstructed results on the Kodak [22] dataset. The proposed DiffEIC produces much better results in terms of perception and fidelity. For example, the small attic is well reconstructed.
