UAV reactive obstacle avoidance method based on LSTM deep reinforcement learning, YANG Xiuxia. Flight Dynamics (飞行力学)

UAV to avoid obstacles quickly in different environments. In simulation experiments, the method is compared with several algorithms, and the results show that the proposed algorithm generalizes well and is effective.

Keywords: unmanned aerial vehicle (UAV); long short-term memory (LSTM) neural network; deep deterministic policy gradient (DDPG) algorithm; reactive obstacle avoidance; uncertain environment
CLC number: V249    Document code: A

UAV deep reinforcement learning reactive obstacle avoidance method based on LSTM
YANG Xiuxia, GAO Hengjie, LIU Wei, ZHANG Yi
(Naval Aviation University, Yantai 264001, China)

Abstract: Traditional obstacle avoidance methods are difficult to apply in complex, uncertain environments with multiple obstacles. To address this problem, a UAV reactive obstacle avoidance method based on LSTM-DDPG is proposed. First, a DDPG obstacle avoidance method built on the velocity obstacle method is used to produce the UAV's avoidance strategy in a dynamic environment. Second, an LSTM neural network is introduced to improve the DDPG obstacle avoidance method, which resolves its inability to represent state information for a varying number of obstacles. Finally, an obstacle change strategy is designed so that the UAV learns in a randomized training environment and can therefore avoid obstacles quickly in different environments. In simulation experiments, the method is compared with several algorithms, and the results show that the proposed algorithm generalizes well and is effective.
Key words: UAV; LSTM; DDPG; reactive obstacle avoidance; uncertain environment

challenges. Traditional obstacle avoidance methods, such as the A* algorithm [1-2], 人

0 Introduction