I. Introduction
In recent years, deep reinforcement learning (DRL) has received extensive attention due to its powerful learning and decision-making capabilities, especially in the field of robot control, where it enables versatile tasks that are challenging to implement with conventional methods, such as picking in cluttered environments [1] and deformable object manipulation [2]. In particular, robots often operate in complex workspaces, where collisions could cause serious damage and safety risks. Hence, rigorous requirements are placed on a robot's ability to avoid obstacles.