具有精英策略的深度强化学习无人机集群通信网络拓扑设计 (Topology Design of Network Based on Deep Reinforcement Learning with Strategy of Elite)

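Cite this article:	DONG Fanghao, FENG Youqian, YIN Zhonghai, LIANG Xiaolong, ZHOU Cheng, LI Mingjie. Topology Design of Network Based on Deep Reinforcement Learning with Strategy of Elite[J]. Journal of Air Force Engineering University, 2019, 20(4): 52-58.

Authors:	DONG Fanghao, FENG Youqian, YIN Zhonghai, LIANG Xiaolong, ZHOU Cheng, LI Mingjie

Affiliations:	Department of Basic Sciences, Air Force Engineering University, Xi'an 710051; Air Traffic Control and Navigation College, Air Force Engineering University, Xi'an 710051

Funding:	National Natural Science Foundation of China (61472443)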

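Abstract:	Designing a directional-antenna network topology for a UAV swarm is NP-hard. To meet the requirements of high survivability, low power consumption, and high stability, this paper proposes a deep reinforcement learning algorithm with an elite strategy for generating communication network topologies, using survivability (3-connectivity), link count, link power consumption, and stability as rewards, and verifies that the elite experience pool accelerates training. Compared with conventional DQN, introducing the elite experience pool effectively speeds up model convergence and cuts training time by more than a factor of three. Unlike a genetic algorithm, the proposed algorithm separates training from use: once the network is trained, it can compute a communication network topology in real time as the scenario requires. In the experiments, 3-connected communication network topologies were designed for 6, 10, 24, and 36 nodes at randomly given spatial positions. The results show that the algorithm offers strong real-time performance and applicability: for networks of no more than 36 nodes, the topology can be recomputed within 183 ms, which meets the real-time requirements of practical applications.
Keywords:	deep reinforcement learning; elite experience pool; communication network connectivity; communication network topology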
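
The abstract names a reward built from survivability (3-connectivity), link count, link power consumption, and stability, but does not give the formula. The following is a minimal sketch of how such a reward might be assembled, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the weights, the reading of "3-connectivity" as vertex connectivity, and the use of squared link length as a stand-in for link power. The stability term is omitted because the abstract does not define it.

```python
from itertools import combinations
from math import dist


def is_connected(nodes, edges):
    """Return True if the undirected graph induced on `nodes` is connected."""
    nodes = set(nodes)
    if not nodes:
        return True
    adj = {v: set() for v in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:  # ignore links touching removed nodes
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:  # depth-first search from an arbitrary node
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w)
            stack.append(w)
    return seen == nodes


def is_k_connected(n, edges, k=3):
    """Brute-force vertex k-connectivity test for nodes 0..n-1.

    The graph needs more than k nodes and must stay connected after
    any k-1 nodes are removed. Cost grows quickly with n, so this is
    only meant for small illustrative graphs.
    """
    if n <= k:
        return False
    all_nodes = set(range(n))
    return all(
        is_connected(all_nodes - set(removed), edges)
        for removed in combinations(range(n), k - 1)
    )


def topology_reward(positions, edges, w_conn=10.0, w_links=0.1, w_power=0.01):
    """Hypothetical reward: bonus for 3-connectivity, penalties for links and power."""
    n = len(positions)
    bonus = w_conn if is_k_connected(n, edges, k=3) else -w_conn
    link_cost = w_links * len(edges)
    # Assumed power proxy: squared Euclidean length of each link.
    power_cost = w_power * sum(dist(positions[u], positions[v]) ** 2 for u, v in edges)
    return bonus - link_cost - power_cost
```

With four nodes at the corners of a unit square, the complete graph is 3-connected and earns the bonus, while a three-link star is not and is penalised. The weights trade survivability against link count and power and would need tuning for any real scenario.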
Topology Design of Network Based on Deep Reinforcement Learning with Strategy of Elite

DONG Fanghao, FENG Youqian, YIN Zhonghai, LIANG Xiaolong, ZHOU Cheng, LI Mingjie. Topology Design of Network Based on Deep Reinforcement Learning with Strategy of Elite[J]. Journal of Air Force Engineering University (Natural Science Edition), 2019, 20(4): 52-58.

Authors:	DONG Fanghao, FENG Youqian, YIN Zhonghai, LIANG Xiaolong, ZHOU Cheng, LI Mingjie

Abstract:	Topology design for directional-antenna networks in UAV swarms is an NP-hard problem. Driven by the requirements of high survivability, low power consumption, and high stability, a deep reinforcement learning algorithm with an elite strategy is proposed for generating communication network topologies. Its reward combines survivability (3-connectivity), link count, link power consumption, and stability, and the acceleration provided by the elite experience pool is verified. Compared with traditional DQN, the elite experience pool effectively accelerates model convergence and reduces training time by more than a factor of three. In contrast to a genetic algorithm, the proposed algorithm separates training from use: once training is complete, the communication network topology can be computed in real time according to the needs of the scenario. In the experiments, 3-connected communication network topologies with 6, 10, 24, and 36 nodes at randomly given spatial locations were designed. The experimental results show that the proposed algorithm has strong real-time performance and applicability: for networks with no more than 36 nodes, the topology can be updated within 183 ms, which meets the real-time requirements of practical applications.

Keywords:	deep reinforcement learning; elite experience pool; connectivity; communication network topology
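
The abstract credits an elite experience pool with speeding up DQN training but does not describe how it works. The sketch below shows one plausible reading, in which transitions from the best-scoring episodes are kept in a separate pool and mixed into every training batch. The class name `EliteReplayBuffer`, its parameters, the admission rule (an episode is elite if its return matches or beats the best seen so far), and the fixed mixing fraction are all hypothetical choices for illustration, not the mechanism described in the paper.

```python
import random
from collections import deque


class EliteReplayBuffer:
    """Replay buffer with a separate pool for transitions from high-return episodes."""

    def __init__(self, capacity=10000, elite_capacity=1000, elite_fraction=0.3, seed=None):
        self.regular = deque(maxlen=capacity)      # ordinary experience pool
        self.elite = deque(maxlen=elite_capacity)  # elite experience pool
        self.elite_fraction = elite_fraction       # share of each batch drawn from the elite pool
        self.best_return = float("-inf")
        self.rng = random.Random(seed)

    def add_episode(self, transitions, episode_return):
        """Store an episode; copy it to the elite pool if it ties or beats the best return."""
        self.regular.extend(transitions)
        if episode_return >= self.best_return:
            self.best_return = episode_return
            self.elite.extend(transitions)

    def sample(self, batch_size):
        """Draw a mixed batch: elite transitions first, the rest from the ordinary pool."""
        n_elite = min(int(batch_size * self.elite_fraction), len(self.elite))
        n_regular = min(batch_size - n_elite, len(self.regular))
        batch = self.rng.sample(list(self.elite), n_elite)
        batch += self.rng.sample(list(self.regular), n_regular)
        self.rng.shuffle(batch)
        return batch
```

In a DQN training loop, `add_episode` would be called at the end of each episode and `sample` before each gradient step. Because every batch contains some transitions from successful topologies, the Q-network sees high-reward targets more often than it would under uniform replay, which is one way such a pool could shorten training.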
This article is indexed in Wanfang Data and other databases.