
Centralized Training with Decentralized Execution Reinforcement Learning for Cooperative Multi-agent Systems with Communication Delay



Abstract:

In cooperative multi-agent systems, efficient coordination among agents is essential for accomplishing tasks. VFFAC is a method that learns both the communication between agents and their interactions with the environment in order to obtain high-performing policies. However, the performance of its policies degrades in environments where communication is delayed. Furthermore, the control problem of a cooperative multi-agent system with communication delays in an unknown environment had not previously been formulated. In this study, we formulate the decision-making problem of a cooperative multi-agent system with an unknown environment model and a fixed-length communication delay. We also propose a method that handles communication delays by using the history of information obtained through communication. Simulation experiments in an environment with communication delay demonstrate that the proposed method learns policies that achieve high rewards.
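As a rough illustration of the history-based idea mentioned above (a minimal sketch of our own, not the authors' implementation; the class name DelayedCommBuffer and its parameters are hypothetical), an agent operating under a fixed d-step communication delay can buffer outgoing messages and keep a history of the messages that have actually arrived:

from collections import deque

class DelayedCommBuffer:
    """Hypothetical helper: delivers each agent's broadcast message
    `delay` steps after it was sent, and records the history of
    messages each agent has actually received so far."""

    def __init__(self, n_agents, delay, msg_dim):
        self.delay = delay
        # One FIFO queue per agent, pre-filled with zero messages so
        # the first `delay` steps return a well-defined placeholder.
        self.queues = [
            deque([0.0] * msg_dim for _ in range(delay))
            for _ in range(n_agents)
        ]
        self.history = [[] for _ in range(n_agents)]

    def step(self, sent_messages):
        """sent_messages[i]: message agent i broadcasts this step.
        Returns the messages arriving now (sent `delay` steps ago)."""
        received = []
        for i, msg in enumerate(sent_messages):
            self.queues[i].append(msg)
            delivered = self.queues[i].popleft()  # sent `delay` steps ago
            self.history[i].append(delivered)
            received.append(delivered)
        return received

# Example: 2 agents, messages arrive 3 steps after being sent.
buffer = DelayedCommBuffer(n_agents=2, delay=3, msg_dim=4)
received = buffer.step([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
# For the first 3 steps, `received` is still the zero placeholder;
# a policy conditioned on buffer.history sees only delayed information.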
Date of Conference: 06-09 September 2022
Conference Location: Kumamoto, Japan

1. INTRODUCTION

Reinforcement learning (RL) [1] is a machine learning framework in which an agent learns an effective policy for accomplishing a task through trial and error in an unknown environment. By interacting with the environment, the agent receives rewards and aims to identify a policy that maximizes the discounted cumulative reward.
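Concretely, writing r_{t+1} for the reward received at step t+1 and \gamma \in [0, 1) for the discount factor, this objective is the standard discounted return (a textbook formulation following [1]; the notation is ours, not taken from this paper):

% Discounted cumulative reward from step t, and the optimal policy.
G_t = \sum_{k=0}^{\infty} \gamma^{k} r_{t+k+1},
\qquad
\pi^{*} = \arg\max_{\pi} \, \mathbb{E}_{\pi}\!\left[ G_t \right]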

REFERENCES
[1] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, The MIT Press, 2018.
[2] L. Busoniu, R. Babuska and B. De Schutter, "A Comprehensive Survey of Multiagent Reinforcement Learning", IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 38, no. 2, pp. 156-172, 2008.
[3] M. Hüttenrauch, A. Šošić and G. Neumann, "Guided Deep Reinforcement Learning for Swarm Systems", AAMAS 2017 Autonomous Robots and Multirobot Systems (ARMS) Workshop, 2017.
[4] S. Shalev-Shwartz, S. Shammah and A. Shashua, "Safe Multi-Agent Reinforcement Learning for Autonomous Driving", 2016.
[5] S. Grigorescu, B. Trasnea, T. Cocias and G. Macesanu, "A survey of deep learning techniques for autonomous driving", Journal of Field Robotics, vol. 37, no. 3, pp. 362-386, 2020.
[6] J. Foerster, I. A. Assael, N. de Freitas and S. Whiteson, "Learning to Communicate with Deep Multi-Agent Reinforcement Learning", in Advances in Neural Information Processing Systems, vol. 29, Curran Associates, Inc., 2016.
[7] T. Wang, J. Wang, C. Zheng and C. Zhang, "Learning Nearly Decomposable Value Functions Via Communication Minimization", 2020.
[8] B. Wu, X. Yang, C. Sun, R. Wang, X. Hu and Y. Hu, "Learning Effective Value Function Factorization via Attentional Communication", 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 629-634, 2020.
[9] F. A. Oliehoek, M. T. Spaan and N. Vlassis, "Dec-POMDPs with delayed communication", The 2nd Workshop on Multi-agent Sequential Decision-Making in Uncertain Domains, 2007.
[10] Y. Cao, W. Yu, W. Ren and G. Chen, "An Overview of Recent Progress in the Study of Distributed Multi-Agent Coordination", IEEE Transactions on Industrial Informatics, vol. 9, no. 1, pp. 427-438, 2013.
[11] F. A. Oliehoek, "Value-Based Planning for Teams of Agents in Stochastic Partially Observable Environments", PhD thesis, 2010.
[12] F. A. Oliehoek and C. Amato, A Concise Introduction to Decentralized POMDPs, Springer International Publishing, 2016.
[13] J. Foerster, G. Farquhar, T. Afouras, N. Nardelli and S. Whiteson, "Counterfactual Multi-Agent Policy Gradients", Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1, 2018.
[14] T. Rashid, M. Samvelyan, C. Schroeder, G. Farquhar, J. Foerster and S. Whiteson, "QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning", International Conference on Machine Learning, pp. 4295-4304, 2018.
[15] K. Cho, B. van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, et al., "Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation", 2014.
[16] D. Ha, A. Dai and Q. V. Le, "HyperNetworks", 2016.