Counterfactual Reward Estimation for Credit Assignment in Multi-agent Deep Reinforcement Learning over Wireless Video Transmission | IEEE Conference Publication | IEEE Xplore

Abstract:

This study investigates frame-wise optimization in Mobile Edge Computing (MEC) for video transmission, emphasizing dynamic adaptation to diverse frame complexities and efficient resource utilization. The comprehensive system model captures the complexities of joint optimization in MEC for real-time video transmission, addresses challenges associated with error concealment techniques, and enhances the user experience by handling successive frame losses. To handle credit assignment in multi-agent scenarios, we integrate counterfactual reward shaping and introduce counterfactual reward multi-agent proximal policy optimization (CRMAPPO). Results reveal the impact of the credit assignment parameter (β) on algorithm performance, demonstrating a trade-off between accurate credit assignment and policy bias. With an optimal choice of β, CRMAPPO surpasses traditional MAPPO, achieving a 109.18% improvement in total rewards. This research contributes to optimizing resource allocation for video transmission within MEC frameworks, addressing the challenges of frame-wise optimization and clarifying credit assignment dynamics in multi-agent environments.
Date of Conference: 23-26 July 2024
Date Added to IEEE Xplore: 22 August 2024
Conference Location: Jersey City, NJ, USA

I. Introduction

Efficient resource allocation in Mobile Edge Computing (MEC) for real-time video is challenging because video content is dynamic [1], [2]. Existing strategies, such as batch frame optimization or tile-based approaches [3], [4], struggle with the variability of video frames. These challenges become even more critical in applications that demand real-time responsiveness, such as Extended Reality (XR), gaming, and telemedicine. Traditional strategies may fall short of meeting the varied demands of individual frames in dynamic video content, whereas frame-wise optimization promises more precise resource allocation that tracks these dynamics. Additionally, the varying complexity of video frames calls for a dynamic approach to error concealment, a crucial strategy for correcting distortions in transmitted video frames. Frame-wise optimization makes it possible to tailor error concealment to the specific needs of each frame, ensuring more efficient video transmission. Such a system must also jointly optimize several elements, including frame resolutions and computing and communication resources. Multi-Agent Deep Reinforcement Learning (MADRL) is effective for handling multiple optimization targets in wireless networks [5], but it faces challenges with credit assignment because feedback is mixed among agents. This calls for advanced methods that improve the efficiency and stability of credit assignment within MADRL algorithms.
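
The credit assignment problem can be illustrated with a small difference-reward sketch. This is not the CRMAPPO formulation, whose equations do not appear in this preview. The blending rule, the function names `counterfactual_rewards` and `toy_team_reward`, the `default_action` baseline, and the toy reward are all assumptions chosen for illustration. The only element taken from the text is a parameter β that trades shared feedback against per-agent credit.

```python
from typing import Callable, List, Sequence


def toy_team_reward(actions: Sequence[int]) -> float:
    """Hypothetical shared reward: every agent sees the same scalar."""
    return float(sum(actions))


def counterfactual_rewards(
    team_reward: Callable[[Sequence[int]], float],
    actions: Sequence[int],
    default_action: int,
    beta: float,
) -> List[float]:
    """Blend the shared reward with a per-agent difference reward.

    Assumed rule (not taken from the paper):
        r_i = (1 - beta) * R(a) + beta * (R(a) - R(a with a_i := default))
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")

    shared = team_reward(actions)
    shaped = []
    for i in range(len(actions)):
        # Counterfactual joint action: only agent i's action is replaced.
        counterfactual = list(actions)
        counterfactual[i] = default_action
        # How much the shared reward drops without agent i's actual action.
        difference = shared - team_reward(counterfactual)
        shaped.append((1.0 - beta) * shared + beta * difference)
    return shaped
```

Under these assumptions, with `actions = [1, 0, 1]` and `default_action = 0`, setting `beta = 0.0` gives every agent the same shared reward of 2.0, while `beta = 1.0` gives each agent only its own marginal contribution. Intermediate values mix the two signals, which is one simple way to picture a trade-off controlled by β.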
