
Efficient Reinforcement Learning for Autonomous Ship Collision Avoidance under Learning Experience Reuse


Abstract:

In this paper, a learning experience reuse reinforcement learning collision avoidance (LER-RLCA) method is proposed that synthesizes a near-optimal collision avoidance policy with efficient sampling and good seamanship, to address the local safe sailing of an autonomous ship in a multi-obstacle environment. Building on general reinforcement learning (RL), learning experience reuse mines the hidden features of historical training data. Meanwhile, a new reward function combining an external revenue signal with an internal incentive signal is designed to encourage exploration of regions of the environment with a low state-transition probability. We further apply the LER-RLCA algorithm to the simulation of autonomous ship collision avoidance. The results show that the proposed LER-RLCA algorithm achieves collision-free, safe navigation of autonomous ships, avoids falling into local iteration, greatly improves the convergence speed of the algorithm, and improves the performance of online collision avoidance decision-making.
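The abstract states only that the internal incentive signal rewards transitions with a low visitation probability; the exact formula is not given in this excerpt. As an illustrative sketch, a common count-based formulation adds a bonus of 1/sqrt(N(s, a)) to the external reward (the function name, `beta` weight, and state/action encoding below are all assumptions, not the authors' method):

```python
import math
from collections import defaultdict

def combined_reward(r_ext, visit_counts, state, action, beta=0.1):
    """Combine the external (task) reward with an intrinsic novelty bonus.

    A count-based bonus r_int = 1 / sqrt(N(s, a)) is one standard way to
    reward rarely visited state-action pairs; it decays as a pair is
    revisited, so exploration pressure fades over training.
    """
    visit_counts[(state, action)] += 1
    r_int = 1.0 / math.sqrt(visit_counts[(state, action)])
    return r_ext + beta * r_int

counts = defaultdict(int)
r1 = combined_reward(1.0, counts, (0, 0), "port")  # first visit: full bonus
r2 = combined_reward(1.0, counts, (0, 0), "port")  # bonus decays with revisits
```

With `beta` the combined signal stays dominated by the external task reward while still nudging the agent toward rarely tried maneuvers.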
Date of Conference: 28-30 October 2022
Date Added to IEEE Xplore: 29 December 2022
Conference Location: Guangzhou, China


I. Introduction

Ship autonomous navigation technology is crucial for guaranteeing maritime safety; it integrates intelligent perception, collision avoidance, decision-making, control, and communication. In recent years, with the development of artificial intelligence, intelligent learning methods have gradually been applied to robots, drones, and unmanned vehicles, in fields such as intelligent optimization scheduling, decision planning, trajectory following, and forecasting [1]–[4]. RL is an artificial-intelligence-based optimization learning method. Unlike traditional optimization or planning algorithms, it does not rely on prior knowledge or supervision information; by interacting with the environment through "trial and error" and balancing exploration and exploitation, it ultimately learns optimization and planning. Owing to this advantage, it has received increasing attention in research on autonomous ship decision-making, planning, and control [5]–[8].
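The paper's exact mechanism for mining hidden features from historical training data is not detailed in this excerpt. As a generic sketch of the experience-reuse idea, the tabular Q-learning loop below stores every transition and replays a random minibatch of historical transitions after each environment step; the function names, hyperparameters, and the toy grid environment with a single static obstacle are all illustrative assumptions:

```python
import random
from collections import defaultdict, deque

def train_q_with_replay(env_step, actions, start, episodes=200,
                        replay_size=32, alpha=0.1, gamma=0.95,
                        eps=0.2, seed=0):
    """Tabular Q-learning that reuses stored (s, a, r, s', done) tuples.

    Replaying past transitions propagates reward information faster than
    learning from each transition once, which is one generic way to
    realize 'learning experience reuse'.
    """
    rng = random.Random(seed)
    Q = defaultdict(float)
    buffer = deque(maxlen=10_000)

    def update(s, a, r, s2, done):
        target = r if done else r + gamma * max(Q[(s2, b)] for b in actions)
        Q[(s, a)] += alpha * (target - Q[(s, a)])

    for _ in range(episodes):
        s, done = start, False
        while not done:
            a = (rng.choice(actions) if rng.random() < eps
                 else max(actions, key=lambda b: Q[(s, b)]))
            s2, r, done = env_step(s, a)
            buffer.append((s, a, r, s2, done))
            update(s, a, r, s2, done)
            # experience reuse: replay a minibatch of historical transitions
            for t in rng.sample(buffer, min(replay_size, len(buffer))):
                update(*t)
            s = s2
    return Q

# Toy 4x4 grid for illustration: start (0, 0), goal (3, 3), obstacle (1, 1).
def grid_step(s, a):
    dx, dy = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}[a]
    s2 = (min(max(s[0] + dx, 0), 3), min(max(s[1] + dy, 0), 3))
    if s2 == (1, 1):              # collision with the static obstacle
        return s2, -1.0, True
    if s2 == (3, 3):              # reached the goal waypoint
        return s2, 1.0, True
    return s2, -0.01, False       # small step cost encourages short paths

Q = train_q_with_replay(grid_step, ["N", "S", "E", "W"], start=(0, 0))
```

The small negative step reward plays the role of an external signal discouraging long detours, while the terminal rewards encode collision and goal-reaching events.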

References
1. N. Altuntas, E. Imal, N. Emanet et al., "Reinforcement learning-based mobile robot navigation", Turkish Journal of Electrical Engineering & Computer Sciences, vol. 24, no. 3, pp. 1747-1767, 2016.
2. B. R. Kiran, I. Sobh, V. Talpaert et al., "Deep reinforcement learning for autonomous driving: A survey", IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 6, pp. 4909-4926, 2021.
3. H. Gao, Z. Kan and K. Li, "Robust lateral trajectory following control of unmanned vehicle based on model predictive control", IEEE/ASME Transactions on Mechatronics, vol. 27, no. 3, pp. 1278-1287, 2022.
4. C. Wang, X. Zhang and L. Wang, "Navigation Situation Adaptive Learning-Based Path Planning of Maritime Autonomous Surface Ships", 2021 6th International Conference on Transportation Information and Safety (ICTIS), pp. 342-347, 2021.
5. H. Gao et al., "Automatic parking control of unmanned vehicle based on switching control algorithm and backstepping", IEEE/ASME Transactions on Mechatronics, vol. 27, no. 3, pp. 1233-1243, 2022.
6. C. Wang, X. Zhang, L. Cong et al., "Research on intelligent collision avoidance decision-making of unmanned ship in unknown environments", Evolving Systems, vol. 10, no. 4, pp. 649-658, 2019.
7. L. Zhao and M. I. Roh, "COLREGs-compliant multiship collision avoidance based on deep reinforcement learning", Ocean Engineering, vol. 191, p. 106436, 2019.
8. H. Gao et al., "An interacting multiple model for trajectory prediction of intelligent vehicles in typical road traffic scenario", IEEE Transactions on Neural Networks and Learning Systems, 2021.
9. Y. He, X. Liu, K. Zhang et al., "Dynamic adaptive intelligent navigation decision making method for multi-object situation in open water", Ocean Engineering, vol. 253, p. 111238, 2022.
10. C. Wang, X. Zhang, J. Zhang et al., "Method for intelligent obstacle avoidance decision-making of unmanned vessel in unknown waters", Chinese Journal of Ship Research, vol. 13, no. 6, pp. 72-77, 2018.
11. S. Zhou, X. Yang, K. Liu et al., "COLREGs-Compliant Method for Ship Collision Avoidance Based on Deep Reinforcement Learning", Navigation of China, vol. 43, no. 3, pp. 27-32, 46, 2020.
12. X. Zhang, C. Wang, L. Jiang et al., "Collision-avoidance navigation systems for Maritime Autonomous Surface Ships: A state of the art survey", Ocean Engineering, vol. 235, p. 109380, 2021.
13. X. Zhang, C. Wang, Y. Liu et al., "Decision-making for the autonomous navigation of maritime autonomous surface ships based on scene division and deep reinforcement learning", Sensors, vol. 19, no. 18, p. 4055, 2019.
14. Y. Dong and X. Zou, "Mobile Robot Path Planning Based on Improved DDPG Reinforcement Learning Algorithm", 2020 IEEE 11th International Conference on Software Engineering and Service Science (ICSESS), pp. 52-56, 2020.
15. S. Ding, W. Du, X. Zhao et al., "A new asynchronous reinforcement learning algorithm based on improved parallel PSO", Applied Intelligence, vol. 49, no. 12, pp. 4211-4222, 2019.
16. Z. Hu, K. Wan, X. Gao et al., "A dynamic adjusting reward function method for deep reinforcement learning with adjustable parameters", Mathematical Problems in Engineering, vol. 2019, art. no. 7619483, pp. 1-11, 2019.
17. S. E. Li, Reinforcement Learning for Decision-making and Control, 2022.
18. H. Gao et al., "A Structure Constraint Matrix Factorization Framework for Human Behavior Segmentation", IEEE Transactions on Cybernetics, 2021.
19. M. Sewak, Deep Reinforcement Learning, Springer Singapore, 2019.
20. S.-P. Wang, C.-P. Du and Y. Zheng, "Local Planner for Flapping Wing Micro Aerial Vehicle Based on Deep Reinforcement Learning", Control and Decision, pp. 1-10.
21. H. Gao et al., "Trajectory prediction of cyclist based on dynamic Bayesian network and long short-term memory model at unsignalized intersections", Science China Information Sciences, vol. 64, no. 7, pp. 1-13, 2021.
22. E. Wiewiora, G. W. Cottrell and C. Elkan, "Principled methods for advising reinforcement learning agents", Proceedings of the 20th International Conference on Machine Learning (ICML-03), pp. 792-799, 2003.
23. H. Gao et al., "Situational assessment for intelligent vehicles based on Stochastic model and Gaussian distributions in typical traffic scenarios", IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 52, no. 3, pp. 1426-1436, 2020.