
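Stealthy engagement strategy for aircraft based on the MDP framework
Cite this article: XU An, YU Lei, KOU Ying-xin, XU Bao-wei, LI Zhan-wu. Stealthy engagement strategy for aircraft based on the MDP framework[J]. Systems Engineering and Electronics, 2011, 33(5): 1063.
Authors: XU An  YU Lei  KOU Ying-xin  XU Bao-wei  LI Zhan-wu
Affiliation: Engineering College, Air Force Engineering University, Xi'an 710038, Shaanxi, China
Abstract: The stealthy engagement decision problem for air combat aircraft is studied using approximate dynamic programming (ADP). Based on the tactical employment principles of combat aircraft, the advantage region and the exposure region of the stealthy engagement process are defined. A reinforcement learning method for the stealthy engagement strategy is constructed on the Markov decision process (MDP). The discontinuous immediate reward function is corrected by a situation scoring function, and ADP-based methods for policy learning and policy extraction are given. Simulations are carried out for the adversary's different maneuvering countermeasures, with and without support from an information source. The results show that applying ADP to the learning of stealthy engagement strategies is feasible and yields reasonably effective engagement strategies in different situations.

Keywords: stealthy engagement  Markov decision process  approximate dynamic programming  air combat decision-making  approximate value function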

Stealthy engagement maneuvering strategy for air combat based on MDP
XU An, YU Lei, KOU Ying-xin, XU Bao-wei, LI Zhan-wu. Stealthy engagement maneuvering strategy for air combat based on MDP[J]. Systems Engineering and Electronics, 2011, 33(5): 1063.
Authors:XU An  YU Lei  KOU Ying-xin  XU Bao-wei  LI Zhan-wu
Institution:Engineering College, Air Force Engineering University, Xi’an 710038, China
Abstract: A stealthy engagement maneuvering strategy for air combat based on approximate dynamic programming (ADP) is studied. The advantage region and the exposure region are defined from the operational principles of combat aircraft. A stealthy engagement decision framework is established on the Markov decision process (MDP), and an ADP-based value iteration method is proposed. The immediate reward function is modified by a situation scoring function, and the strategy learning and policy extraction methods are described. Finally, the policies are validated in simulations in which the adversary either has or lacks access to an external information source. The simulation results show that applying ADP to stealthy engagement in air combat is feasible and that the policies extracted by this method are effective in different initial situations.
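Keywords: stealthy engagement  Markov decision process  approximate dynamic programming  air combat decision-making  approximate value function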