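Target search path planning for the naval battlefield based on deep reinforcement learning
Cite this article: YANG Qingqing, GAO Yingying, GUO Yu, XIA Boyuan, YANG Kewei. Target search path planning for the naval battlefield based on deep reinforcement learning[J]. Systems Engineering and Electronics, 2022, 44(11): 3486-3495.
Authors: YANG Qingqing  GAO Yingying  GUO Yu  XIA Boyuan  YANG Kewei
Affiliation: College of Systems Engineering, National University of Defense Technology, Changsha 410073, Hunan, China
Funding: National Natural Science Foundation of China (72071206); National Natural Science Foundation of China (71690233); Hunan Provincial Science and Technology Innovation Program (2020RC4046); China Postdoctoral Science Foundation (2019M653923)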
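Abstract: The naval battlefield is one of the main arenas of future great-power conflict. A strong target search capability in the naval battlefield is the last line of defense for maritime training and combat operations and, because of the complex, changeable environment and its strategic importance, it is also the most difficult and most central part of joint battlefield search and rescue. To address the short target survival time and high real-time requirements that characterize target search in the naval battlefield, a target search planning method based on deep reinforcement learning is proposed. First, a mathematical programming model of the naval battlefield target search scenario is constructed and mapped to a reinforcement learning model. Then, based on the Rainbow deep reinforcement learning algorithm, the state vector, the neural network structure, and the algorithm framework and workflow for naval battlefield target search planning are designed. Finally, a case study verifies the feasibility and effectiveness of the proposed method: compared with the conventionally used parallel search pattern, it substantially improves the search success rate.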

Keywords: naval battlefield  target search  path planning  dynamic planning  deep reinforcement learning
Received: 2021-09-01

Target search path planning for naval battle field based on deep reinforcement learning
Qingqing YANG, Yingying GAO, Yu GUO, Boyuan XIA, Kewei YANG. Target search path planning for naval battle field based on deep reinforcement learning[J]. Systems Engineering and Electronics, 2022, 44(11): 3486-3495.
Authors:Qingqing YANG  Yingying GAO  Yu GUO  Boyuan XIA  Kewei YANG
Institution:College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Abstract:The naval battlefield is one of the main arenas of future great-power conflict. A strong target search capability in the naval battlefield is the last line of defense for maritime training and combat, and, because of the complex, changeable environment and its strategic importance, it is also the most difficult and most central part of joint battlefield search and rescue. A target search planning method based on deep reinforcement learning is proposed to address the short target survival time and high real-time requirements of target search in the naval battlefield. First, a mathematical programming model of the naval battlefield target search scenario is constructed and mapped to a reinforcement learning model. Then, based on the Rainbow deep reinforcement learning algorithm, the state vector, the neural network structure, and the algorithm framework and workflow for target search planning are designed. Finally, a case study verifies the feasibility and effectiveness of the proposed method, which substantially improves the search success rate compared with the conventional parallel search pattern.
Keywords:naval battle field  target search  path planning  dynamic planning  deep reinforcement learning  
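
The abstract describes mapping the search problem to a reinforcement learning model and comparing the learned policy against a parallel search pattern. As a rough illustration only, the sketch below frames a toy grid search as an episodic environment and implements a row-by-row parallel sweep as a baseline. Everything in it is an assumption made for illustration and is not taken from the paper: the grid discretization, the four-move action set, the reward values, the stationary target, the use of `max_steps` as a stand-in for the target's limited survival time, and the names `GridSearchEnv`, `parallel_sweep_actions`, and `run_policy`. It does not implement Rainbow.

```python
from dataclasses import dataclass
from typing import List, Tuple

Cell = Tuple[int, int]

# Hypothetical action set: 0 = up, 1 = down, 2 = left, 3 = right.
ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}


@dataclass
class GridSearchEnv:
    """Toy search environment (illustrative assumption, not the paper's model)."""
    rows: int
    cols: int
    target: Cell      # stationary target cell, assumed for simplicity
    max_steps: int    # stand-in for the target's limited survival time

    def reset(self) -> Cell:
        self.pos = (0, 0)
        self.steps = 0
        return self.pos

    def step(self, action: int):
        dr, dc = ACTIONS[action]
        # Clamp the move so the searcher stays inside the grid.
        r = min(max(self.pos[0] + dr, 0), self.rows - 1)
        c = min(max(self.pos[1] + dc, 0), self.cols - 1)
        self.pos = (r, c)
        self.steps += 1
        found = self.pos == self.target
        reward = 1.0 if found else -0.01  # assumed reward shaping
        done = found or self.steps >= self.max_steps
        return self.pos, reward, done


def parallel_sweep_actions(rows: int, cols: int) -> List[int]:
    """Row-by-row sweep from (0, 0), alternating direction on each row."""
    actions: List[int] = []
    for r in range(rows):
        horizontal = 3 if r % 2 == 0 else 2  # right on even rows, left on odd
        actions.extend([horizontal] * (cols - 1))
        if r < rows - 1:
            actions.append(1)  # move down to the next row
    return actions


def run_policy(env: GridSearchEnv, actions: List[int]) -> Tuple[bool, int]:
    """Replay a fixed action sequence; return (target found, steps used)."""
    if env.reset() == env.target:
        return True, 0
    for t, action in enumerate(actions, start=1):
        _, _, done = env.step(action)
        if done:
            return env.pos == env.target, t
    return False, len(actions)
```

A learned policy would replace the fixed action list with actions chosen from the current state; under this toy framing, a search counts as successful only if the target cell is reached before `max_steps` runs out.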