Rate-Distortion-Perception Controllable Joint Source-Channel Coding for High-Fidelity Generative Semantic Communications | IEEE Journals & Magazine | IEEE Xplore

Abstract:

End-to-end image transmission has recently become a crucial trend in intelligent wireless communications, driven by the increasing demand for high bandwidth efficiency. However, existing methods primarily optimize the trade-off between bandwidth cost and objective distortion, often failing to deliver visually pleasing results aligned with human perception. In this paper, we propose a novel rate-distortion-perception (RDP) jointly optimized joint source-channel coding (JSCC) framework to enhance perceptual quality in human communications. Our RDP-JSCC framework integrates a flexible plug-in conditional generative adversarial network (GAN) to provide detailed and realistic image reconstructions at the receiver, overcoming the limitations of traditional rate-distortion optimized solutions that typically produce blurry or poorly textured images. Based on this framework, we introduce a distortion-perception controllable transmission (DPCT) model, which addresses the variation in the perception-distortion trade-off. DPCT uses a lightweight spatial realism embedding module (SREM) to condition the generator on a realism map, enabling the customization of appearance realism for each image region at the receiver from a single transmission. Furthermore, for scenarios with scarce bandwidth, we propose an interest-oriented content-controllable transmission (CCT) model. CCT prioritizes the transmission of regions that attract user attention and generates other regions from an instance label map, ensuring both content consistency and appearance realism for all regions while proportionally reducing channel bandwidth costs. Comprehensive experiments demonstrate the superiority of our RDP-optimized image transmission framework over state-of-the-art engineered image transmission systems and advanced perceptual methods.
Page(s): 672 - 686
Date of Publication: 05 December 2024

I. Introduction

The rapid expansion of ultra-large-scale image/video transmission applications in camera phones and extended reality devices continues to drive the demand for efficient transmission of large media data under limited bandwidth conditions. Traditional communication systems, based on the source-channel separation paradigm, utilize rate-distortion theory for source coding and channel coding theory for transmission. These systems aim to minimize the size of source data under a distortion constraint while ensuring reliable data transmission over noisy channels. However, they can cause significant bandwidth waste due to their focus on global bit information rather than critical semantic information. To address this inefficiency, recent advancements in deep learning have inspired data-driven solutions that extract semantic feature information [1], [2], [3], [4] and implement joint source-channel coding (JSCC) [5], [6], [7], [8], [9], [10] for end-to-end communications.
