I. Introduction
With the development of marine science and advances in artificial intelligence and intelligent systems, autonomous underwater vehicles (AUVs) are evolving toward self-learning and adaptive operation [1]. At present, most AUVs used for deep-water exploration are underactuated [2]. Such a vehicle generally has only a stern thruster, and steering and pitching are achieved through vectored propulsion or rudders [3], [4]. Path planning is one of the core problems in the underactuated AUV field. Its purpose is to find an optimal path from the start point to the goal [5], [6]. The path planning environment is either static or dynamic. In a static environment, global environmental information such as terrain, obstacles, and disturbances is known, so a path can be planned in advance of the detection mission. In a dynamic environment, however, the global environmental information is unknown and the path must be planned in real time [7]. Of the two, real-time path planning in dynamic environments is both more practically significant and more difficult.