
Deep Reinforcement Learning Based Trajectory Design and Resource Allocation for UAV-Assisted Communications


Abstract:

In this letter, we investigate unmanned aerial vehicle (UAV)-assisted communications in a three-dimensional (3-D) environment, where one UAV is deployed to serve multiple user equipments (UEs). The locations and quality of service (QoS) requirements of the UEs vary, and the flying time of the UAV is unknown because it depends on the UAV's battery. To address this issue, a proximal policy optimization 2 (PPO2)-based deep reinforcement learning (DRL) algorithm is proposed, which controls the UAV in an online manner. Specifically, it allows the UAV to adjust its speed, direction and altitude so as to minimize its serving time while satisfying the QoS requirements of the UEs. Simulation results demonstrate the effectiveness of the proposed framework.
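
As a rough illustration of the PPO family of algorithms that the abstract refers to, the sketch below computes the clipped surrogate objective described in [10]. It is not the letter's implementation: the function name, the pure-Python loop and the clipping value of 0.2 are assumptions made for this example, and the letter's state, action and reward definitions are not shown here.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective in the style of Schulman et al. [10].

    new_logp / old_logp: log-probabilities of the taken actions under the
    updated and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same actions.
    clip_eps: clipping range; 0.2 is an assumed, commonly used value.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Ratio restricted to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic choice between the unclipped and the clipped term.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Same policy before and after: the objective is the mean advantage.
    print(ppo_clip_objective([0.0, 0.0], [0.0, 0.0], [1.0, 3.0]))
```

When the old and new policies agree, the ratio is 1 for every sample and the objective reduces to the mean advantage. Taking the minimum of the two terms removes any incentive to push the ratio beyond the clipping range in the direction that would otherwise increase the objective.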
Published in: IEEE Communications Letters (Volume: 27, Issue: 9, September 2023)
Page(s): 2398 - 2402
Date of Publication: 10 July 2023


I. Introduction

Unmanned aerial vehicle (UAV)-assisted communication is expected to play an important role in future wireless communications. Several challenges must be addressed before UAVs can be used effectively for communication [1]. Existing contributions on UAV-assisted communication systems fall into two categories: 1) the UAV is statically deployed in the air to enhance wireless coverage; and 2) UAVs are dynamically deployed, for example to serve as relays for the base station (BS) [2] or to collect Internet of Things (IoT) data. Compared with a traditional ground BS, a UAV-assisted BS offers advantages such as enhanced mobility and an increased likelihood of line-of-sight (LoS) communication with users. Existing work on the dynamic deployment of UAVs is normally based on a) convex optimization algorithms [3] or b) deep reinforcement learning (DRL) algorithms [4]. Compared with traditional convex optimization, DRL-based algorithms may achieve higher performance with lower time complexity [5].
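
To make the LoS advantage mentioned above concrete, the sketch below evaluates a widely used sigmoid approximation of the air-to-ground LoS probability, of the kind introduced in [8], as a function of the elevation angle between a UE and the UAV. Whether the letter uses exactly this model is not stated in the text shown here. The function name and the parameter values a = 9.61 and b = 0.16 (values often quoted for an urban environment) are assumptions for illustration only.

```python
import math


def los_probability(horizontal_dist_m, altitude_m, a=9.61, b=0.16):
    """Approximate LoS probability between a ground UE and a UAV.

    Uses the sigmoid form P_LoS = 1 / (1 + a * exp(-b * (theta - a))),
    where theta is the elevation angle in degrees. The defaults for a and b
    are assumed urban-environment values, not taken from the letter.
    """
    # Elevation angle of the UAV as seen from the UE, in degrees.
    theta_deg = math.degrees(math.atan2(altitude_m, horizontal_dist_m))
    return 1.0 / (1.0 + a * math.exp(-b * (theta_deg - a)))


if __name__ == "__main__":
    # Fixed horizontal distance of 200 m, increasing UAV altitude.
    for altitude in (20.0, 100.0, 300.0):
        p = los_probability(200.0, altitude)
        print(f"altitude {altitude:5.0f} m -> P_LoS = {p:.3f}")
```

For a fixed horizontal distance, raising the UAV increases the elevation angle and therefore the LoS probability under this model, which is one reason altitude is a useful control variable alongside speed and direction.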

References
[1] Y. Zeng, Q. Wu and R. Zhang, "Accessing from the sky: A tutorial on UAV communications for 5G and beyond", Proc. IEEE, vol. 107, no. 12, pp. 2327-2375, Dec. 2019.
[2] X. Li, C. Zhang, R. Zhao, C. He, H. Zheng and K. Wang, "Energy-effective offloading scheme in UAV-assisted C-RAN system", IEEE Internet Things J., vol. 9, no. 13, pp. 10821-10832, Jul. 2022.
[3] Q. Wu, Y. Zeng and R. Zhang, "Joint trajectory and communication design for multi-UAV enabled wireless networks", IEEE Trans. Wireless Commun., vol. 17, no. 3, pp. 2109-2121, Mar. 2018.
[4] R. Ding, F. Gao and X. S. Shen, "3D UAV trajectory design and frequency band allocation for energy-efficient and fair communication: A deep reinforcement learning approach", IEEE Trans. Wireless Commun., vol. 19, no. 12, pp. 7796-7809, Dec. 2020.
[5] L. Wang, K. Wang, C. Pan, W. Xu, N. Aslam and A. Nallanathan, "Deep reinforcement learning based dynamic trajectory control for UAV-assisted mobile edge computing", IEEE Trans. Mobile Comput., vol. 21, no. 10, pp. 3536-3550, Oct. 2022.
[6] Z. Wang, G. Zhang, Q. Wang, K. Wang and K. Yang, "Completion time minimization in wireless-powered UAV-assisted data collection system", IEEE Commun. Lett., vol. 25, no. 6, pp. 1954-1958, Jun. 2021.
[7] M. Li, S. He and H. Li, "Minimizing mission completion time of UAVs by jointly optimizing the flight and data collection trajectory in UAV-enabled WSNs", IEEE Internet Things J., vol. 9, no. 15, pp. 13498-13510, Aug. 2022.
[8] A. Al-Hourani, S. Kandeepan and S. Lardner, "Optimal LAP altitude for maximum coverage", IEEE Wireless Commun. Lett., vol. 3, no. 6, pp. 569-572, Dec. 2014.
[9] M. Alzenad, A. El-Keyi and H. Yanikomeroglu, "3-D placement of an unmanned aerial vehicle base station for maximum coverage of users with different QoS requirements", IEEE Wireless Commun. Lett., vol. 7, no. 1, pp. 38-41, Feb. 2018.
[10] J. Schulman, F. Wolski, P. Dhariwal, A. Radford and O. Klimov, "Proximal policy optimization algorithms", arXiv:1707.06347, 2017.