Multi-Agent Reinforcement Learning-Based Buffer-Aided Relay Selection in IRS-Assisted Secure Cooperative Networks | IEEE Journals & Magazine | IEEE Xplore


Abstract:

This paper proposes a multi-agent deep reinforcement learning-based buffer-aided relay selection scheme for an intelligent reflecting surface (IRS)-assisted secure cooperative network in the presence of an eavesdropper. We consider a practical phase model in which both the phase shift and the reflection amplitude are discrete variables that set the reflection coefficients of the IRS. Furthermore, we introduce buffer-aided relays to enhance the secrecy performance, although the use of buffers comes at the cost of delay. Thus, we aim to maximize either the average secrecy rate under a delay constraint or the throughput under both delay and secrecy constraints, by jointly optimizing the buffer-aided relay selection and the IRS reflection coefficients. To solve these two optimization problems, we divide each problem into two sub-tasks and then develop a distributed multi-agent reinforcement learning scheme for the two cooperative sub-tasks, in which each relay node acts as an agent in the distributed learning. We apply the distributed reinforcement learning scheme to optimize the IRS reflection coefficients, and then use an agent at the source to learn the optimal relay selection based on the optimal IRS reflection coefficients in each iteration. Simulation results show that the proposed learning-based scheme, which iteratively learns from the environment to approximate an optimal solution through the exploration of multiple agents, outperforms the benchmark schemes.
Page(s): 4101 - 4112
Date of Publication: 06 August 2021


I. Introduction

With the development of fifth-generation (5G) wireless communication, physical layer security has been widely studied in recent years as a means of providing secure wireless communications [1]. Unlike cryptographic techniques, physical layer security exploits the dynamics of fading channels to achieve perfect secrecy and does not require encryption keys [2]–[4]. Security is also particularly relevant for cooperative communication networks, where it has been investigated in [3], [5], [6]. In [3], the secrecy rate performance of full-duplex (FD) decode-and-forward (DF) cooperative networks was studied with a self-interference cancellation technology. The authors in [5] proposed two linear precoding schemes to improve the secrecy rate performance in half-duplex (HD) amplify-and-forward (AF) relaying systems. To maximize the diversity gain, relay selection was also proposed in cooperative networks to reduce the secrecy outage probability in [6]. To further enhance the secrecy performance, a novel max-ratio buffer-aided relay selection scheme was proposed to select the link with the largest signal-to-interference ratio (SIR) in cooperative networks with buffering technology [7]. Then, the trade-off between the average delay and the secrecy rate for the max-ratio scheme was investigated in a buffer-aided cooperative network [8]. In [9], the max-ratio and state-based schemes were combined to reduce the secrecy outage probability and average delay for buffer-aided cooperative networks. Furthermore, in [10], the average secrecy rate in an energy-harvesting-based buffer-aided cooperative network was enhanced by an adaptive transmission algorithm that accounts for power constraints, buffer occupancy, and energy storage. Although using a buffer improves outage performance, it increases the instantaneous delay, which is a key issue in Internet of Things (IoT) networks [11].
A buffer-state-based probabilistic relay selection method was proposed in [12] to enhance the outage performance under a delay constraint. In [13], the delay-constrained throughput was investigated via deep reinforcement learning (DRL). However, physical layer security has not been considered in delay-constrained buffer-aided cooperative networks. This motivates us to study secure communication systems that satisfy instantaneous delay constraints.
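
To make the max-ratio idea discussed above concrete, the following is a minimal Python sketch, not the algorithm of [7] itself. It assumes that the selection metric is the ratio of the legitimate channel gain to the eavesdropper channel gain on each available link, that a source-to-relay link is available only when the relay buffer is not full, and that a relay-to-destination link is available only when the buffer is not empty. The names `Relay`, `max_ratio_select`, `secrecy_rate` and all field names are hypothetical and chosen for illustration. The `secrecy_rate` helper uses the standard expression, the positive part of the difference between the legitimate and eavesdropper link capacities.

```python
import math
from dataclasses import dataclass


@dataclass
class Relay:
    # All fields are illustrative assumptions, not notation from the paper.
    buffer_len: int    # packets currently stored in the relay buffer
    buffer_size: int   # maximum number of packets the buffer can hold
    g_sr: float        # source -> relay channel gain
    g_rd: float        # relay -> destination channel gain
    g_re: float        # relay -> eavesdropper channel gain


def max_ratio_select(relays, g_se):
    """Pick the available link with the largest legitimate/eavesdropper gain ratio.

    g_se is the (assumed known) source -> eavesdropper channel gain.
    Returns (relay_index, "SR" or "RD"), or None if no link is available.
    """
    best, best_ratio = None, -math.inf
    for k, r in enumerate(relays):
        # Source -> relay link: usable only if the buffer has free space.
        if r.buffer_len < r.buffer_size:
            ratio = r.g_sr / g_se
            if ratio > best_ratio:
                best, best_ratio = (k, "SR"), ratio
        # Relay -> destination link: usable only if the buffer holds a packet.
        if r.buffer_len > 0:
            ratio = r.g_rd / r.g_re
            if ratio > best_ratio:
                best, best_ratio = (k, "RD"), ratio
    return best


def secrecy_rate(snr_legit, snr_eve):
    """Standard secrecy rate in bits/s/Hz: [log2(1+snr_legit) - log2(1+snr_eve)]^+."""
    return max(0.0, math.log2(1.0 + snr_legit) - math.log2(1.0 + snr_eve))


if __name__ == "__main__":
    relays = [
        Relay(buffer_len=0, buffer_size=2, g_sr=4.0, g_rd=9.0, g_re=1.0),
        Relay(buffer_len=2, buffer_size=2, g_sr=10.0, g_rd=3.0, g_re=1.0),
    ]
    print(max_ratio_select(relays, g_se=2.0))
    print(secrecy_rate(3.0, 1.0))
```

In this sketch the buffer state only gates which links are candidates; it does not bound how long a packet waits in a buffer, which is the instantaneous delay issue raised above.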
