I. Introduction
Unmanned aerial vehicles (UAVs) have attracted wide attention in many fields owing to their portability, flexibility, and high mobility. Target tracking is an important part of UAV technology; however, in most current scenarios, UAV operators need extensive training before they can control a UAV well, which adds unnecessary time to task execution and reduces efficiency. With the rapid development of artificial intelligence, UAVs are expected to continuously perceive the environmental state and make autonomous flight decisions, so applying artificial intelligence methods to UAV target tracking is an inevitable trend.
At present, most UAV target tracking methods rely on traditional control approaches, such as the PID, Lyapunov, and backstepping control methods [1]–[4], which make limited use of temporal information and do not use cameras or other sensors to assist UAV flight. Instead of adopting these traditional approaches, this paper uses the TD3 deep reinforcement learning algorithm to control UAV flight: image information and the UAV's state information are combined as input, and actions are output continuously, making autonomous flight decisions and target tracking possible. In addition, a GRU module is added to process the input and store historical information, enhancing the UAV's ability to handle time-series information and perceive the environment.
The visual navigation model proposed in this study is designed for UAV target tracking and decision-making based on a reinforcement learning algorithm fused with a GRU, and it outperforms traditional algorithms in data processing and sequence feature mining. Finally, the effectiveness of the model is verified by experiments on UAV flight decision-making and obstacle avoidance.
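To make the role of the GRU module concrete, the sketch below implements a minimal GRU cell in NumPy and shows how a hidden state accumulates the history of fused observations (image features plus UAV state) before an actor head would map it to continuous actions. This is an illustrative sketch only: the dimensions, the observation fusion, and the absence of a learned actor head are assumptions, not the paper's actual TD3 network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell: its hidden state summarizes the sequence of
    past inputs, giving the policy a memory of the flight history."""
    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(hidden_dim)
        # One weight matrix per gate: update (z), reset (r), candidate (h~).
        self.Wz = rng.uniform(-scale, scale, (hidden_dim, input_dim + hidden_dim))
        self.Wr = rng.uniform(-scale, scale, (hidden_dim, input_dim + hidden_dim))
        self.Wh = rng.uniform(-scale, scale, (hidden_dim, input_dim + hidden_dim))

    def step(self, x, h):
        xh = np.concatenate([x, h])
        z = sigmoid(self.Wz @ xh)               # update gate: how much to rewrite
        r = sigmoid(self.Wr @ xh)               # reset gate: how much history to use
        h_tilde = np.tanh(self.Wh @ np.concatenate([x, r * h]))
        return (1 - z) * h + z * h_tilde        # blend old state with candidate

# Usage: feed a short sequence of fused observation vectors (hypothetical
# stand-ins for image features concatenated with UAV state variables).
cell = GRUCell(input_dim=8, hidden_dim=16)
h = np.zeros(16)
for t in range(5):
    obs = np.full(8, float(t))   # placeholder observation at time step t
    h = cell.step(obs, h)
# h now encodes the whole observation history; in a TD3 setup, an actor
# head would map h to continuous control commands.
```

Because each new hidden state is a convex combination of the previous state and a tanh-bounded candidate, the memory stays bounded while still reflecting the entire input sequence, which is what lets the policy exploit temporal structure that a purely feedforward actor cannot.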