Design of Att-MADDPG Hunting Control Method for Multi-UAV Cooperation



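Cite this article:	LIU Feng, WEI Ruixuan, DING Chao, JIANG Longting, LI Tian. Design of Att-MADDPG Hunting Control Method for Multi-UAV Cooperation[J]. Journal of Air Force Engineering University, 2021, 22(3): 9-14.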

Authors:	LIU Feng, WEI Ruixuan, DING Chao, JIANG Longting, LI Tian

Affiliation:	Aeronautics Engineering College, Air Force Engineering University, Xi'an 710051

Funding:	Ministry of Science and Technology "New Generation Artificial Intelligence" Key Project (2018AAA0102403)

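Abstract:	The hunting of dynamic targets by multiple UAVs is an important problem in UAV swarm operations. To address swarm hunting of dynamic targets, this paper analyzes the shortcomings of the hunting mechanism based on the MADDPG algorithm and, drawing on the attention mechanism used by the Google machine translation team, introduces attention into the hunting process, designs an attention-based cooperative hunting strategy, and constructs the corresponding hunting algorithm. MADDPG is improved within the AC framework: first, an Attention module is added to the Critic network, which processes the information of all hunting UAVs according to their different attention weights; then, an Attention module is added to the Actor network, which prompts the other UAVs to hunt cooperatively. Simulation experiments show that, compared with MADDPG, the Att-MADDPG algorithm improves training stability by 8.9% and reduces task completion time by 19.12%; after learning, the hunting UAVs cooperate so that the swarm exhibits more intelligent emergent hunting behavior.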
Keywords:	cooperative hunting; reinforcement learning; MADDPG; emergence of intelligence
Design of Att-MADDPG Hunting Control Method for Multi-UAV Cooperation

LIU Feng, WEI Ruixuan, DING Chao, JIANG Longting, LI Tian. Design of Att-MADDPG Hunting Control Method for Multi-UAV Cooperation[J]. Journal of Air Force Engineering University (Natural Science Edition), 2021, 22(3): 9-14.

Authors:	LIU Feng, WEI Ruixuan, DING Chao, JIANG Longting, LI Tian

Abstract:	Hunting dynamic targets with multiple UAVs is an important problem in UAV swarm operations. This paper addresses swarm hunting of dynamic targets. After analyzing the shortcomings of the MADDPG-based hunting mechanism, and drawing on the attention mechanism used by the Google machine translation team, we introduce attention into the hunting process, design an attention-based cooperative hunting strategy, and construct the corresponding hunting algorithm. MADDPG is improved within the actor-critic (AC) framework. First, an attention module is added to the Critic network so that the information of all hunting UAVs is processed according to their different attention weights. Then, an attention module is added to the Actor network to prompt the other UAVs to hunt cooperatively. Simulation results show that, compared with MADDPG, the Att-MADDPG algorithm improves training stability by 8.9% and reduces task completion time by 19.12%. After learning, the hunting UAVs cooperate so that the swarm exhibits more intelligent emergent hunting behavior.

Keywords:	cooperative hunting; reinforcement learning; MADDPG; intelligence emergence
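
The abstract says the Critic processes the information of all hunting UAVs "according to different attention weights" but gives no network details. The following is only a minimal sketch of that idea, assuming plain scaled dot-product attention over teammate encodings. The function names (`softmax`, `dot`, `attend`), the two-dimensional encodings, and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values):
    """Scaled dot-product attention for a single UAV (illustrative only).

    query  : encoding of the UAV whose Critic is being evaluated (assumed)
    keys   : one encoding per teammate, scored against the query
    values : one feature vector per teammate, mixed by the attention weights
    """
    scale = math.sqrt(len(query))
    scores = [dot(query, k) / scale for k in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # Weighted sum of teammate features: the context passed on to the Critic.
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context


# Toy data: three teammates whose keys align, are orthogonal to,
# and oppose the query, respectively.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weights, context = attend(query, keys, values)
```

In this toy case the teammate whose key aligns with the query receives the largest weight, so its features dominate the context vector. In a learned model the query, key, and value encodings would come from trainable layers rather than fixed lists.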
This article is indexed in CNKI, Wanfang Data, and other databases.