
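UAV route autonomous guidance maneuver control decision-making algorithm based on deep reinforcement learning
Cite this article:ZHANG Kun, LI Ke, SHI Haotian, ZHANG Zhenchong, LIU Zekun. UAV route autonomous guidance maneuver control decision-making algorithm based on deep reinforcement learning[J]. Systems Engineering and Electronics, 2020, 42(7): 1567-1574.
Authors:ZHANG Kun  LI Ke  SHI Haotian  ZHANG Zhenchong  LIU Zekun
Institution:1. School of Electronics and Information, Northwestern Polytechnical University, Xi'an, Shaanxi 710072; 2. Science and Technology on Electro-Optical Control Laboratory, Luoyang, Henan 471000
Funding:China Scholarship Council program (201806295012); Science and Technology on Electro-Optical Control Laboratory Fund (6142504190105); Northwestern Polytechnical University Seed Fund for Creativity and Innovation of Master's Students (ZZ2019021); Innovative Talents Fund (2017KJXX-15); Aeronautical Science Foundation (20155153034)
Abstract:To address maneuver control decision-making for autonomous route guidance of an unmanned aerial vehicle (UAV) under route terminal constraints, a Markov decision process model is used to build the UAV autonomous flight maneuver model. A maneuver control decision-making algorithm for autonomous UAV route guidance is proposed based on the deep deterministic policy gradient; it fits the maneuver control decision function and the state-action value function, generates an optimal decision network, and is verified in simulation. The simulation results show that the algorithm enables the UAV to fly autonomously to the route target point from an arbitrary initial position and attitude, and can effectively improve the autonomy of UAV maneuver control.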

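Keywords:autonomous guidance  maneuver control decision-making  Markov decision process  deep deterministic policy gradient method  deep reinforcement learning
Received:2019-11-20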

Autonomous guidance maneuver control and decision-making algorithm
Kun ZHANG, Ke LI, Haotian SHI, Zhenchong ZHANG, Zekun LIU. Autonomous guidance maneuver control and decision-making algorithm[J]. Systems Engineering and Electronics, 2020, 42(7): 1567-1574.
Authors:Kun ZHANG  Ke LI  Haotian SHI  Zhenchong ZHANG  Zekun LIU
Institution:1. School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, China; 2. Science and Technology on Electro-Optical Control Laboratory, Luoyang 471000, China
Abstract:To address maneuver control decision-making for autonomous guidance of an unmanned aerial vehicle (UAV) along a route under terminal position constraints, the autonomous flight model of the UAV is formulated as a Markov decision process, and a simulation environment for training the algorithm is constructed. An autonomous guidance maneuver control algorithm for the UAV is then proposed based on the deep deterministic policy gradient (DDPG), in which neural networks fit the guidance maneuver control function and the state-action value function. The simulation results show that a UAV using the proposed algorithm can fly to a fixed position in the horizontal plane from any initial position and attitude, indicating that the algorithm can effectively improve the autonomy of the UAV.
Keywords:autonomous guidance  maneuver control and decision-making  Markov decision process  deep deterministic policy gradient (DDPG) method  deep reinforcement learning  
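
The abstract names DDPG but does not give its update rules. As a rough illustration only, the two generic DDPG ingredients (the temporal-difference target for the state-action value function, and the soft update of target-network weights) might be sketched as below. This is the textbook form, not the paper's implementation: the function names, the flat-list parameter representation, and the values of `gamma` and `tau` are all assumptions made for this sketch.

```python
from typing import Callable, List, Sequence


def td_target(reward: float, done: bool, next_state: Sequence[float],
              target_actor: Callable[[Sequence[float]], Sequence[float]],
              target_critic: Callable[[Sequence[float], Sequence[float]], float],
              gamma: float = 0.99) -> float:
    """Generic DDPG critic target: y = r + gamma * Q'(s', mu'(s')).

    target_actor and target_critic are hypothetical stand-ins for the
    target policy network and target value network.
    """
    if done:
        # No bootstrapping past a terminal state.
        return reward
    next_action = target_actor(next_state)
    return reward + gamma * target_critic(next_state, next_action)


def soft_update(target_params: List[float], online_params: List[float],
                tau: float = 0.005) -> List[float]:
    """Soft target update: theta' <- tau * theta + (1 - tau) * theta'."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]
```

In a full training loop the critic would be regressed toward `td_target` over minibatches from a replay buffer, and the actor would be updated along the gradient of the critic's output with respect to the action; both steps are omitted here.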