RBF neural network-based Q-learning for air vehicles
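Cite this article: XU An, KOU Ying-xin, YU Lei, LI Zhan-wu. RBF neural network-based Q-learning for air vehicles[J]. Systems Engineering and Electronics, 2012, 34(1): 97-101. DOI: 10.3969/j.issn.1001-506X.2012.01.18
Authors: XU An  KOU Ying-xin  YU Lei  LI Zhan-wu
Affiliation: Engineering College, Air Force Engineering University, Xi'an, Shaanxi 710038, China
Funding: Supported by the Aeronautical Science Foundation (20095196012)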
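Abstract: Within the Markov decision process framework, a reinforcement learning method for stealthy engagement strategies in three-dimensional space is studied, and the advantage region and exposure region of the environment model are defined. To address the curse of dimensionality that arises when learning a policy over a high-dimensional state space, a Q-learning algorithm based on a radial basis function neural network (RBFNN) is presented, and a ranked sampling method for the training samples is described. Engagement maneuver strategy learning is then simulated and analyzed for different situations. The simulation results show that, with a reasonable ranked sampling method, the RBFNN-based Q-learning algorithm can effectively generate stealthy engagement strategies.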

Keywords: reinforcement learning  stealthy engagement  Markov decision process  dynamic programming  air combat decision

Stealthy engagement maneuvering strategy with Q-learning based on RBFNN for air vehicles
XU An, KOU Ying-xin, YU Lei, LI Zhan-wu. Stealthy engagement maneuvering strategy with Q-learning based on RBFNN for air vehicles[J]. Systems Engineering and Electronics, 2012, 34(1): 97-101. DOI: 10.3969/j.issn.1001-506X.2012.01.18
Authors: XU An  KOU Ying-xin  YU Lei  LI Zhan-wu
Affiliation: Engineering College, Air Force Engineering University, Xi’an 710038, China
Abstract: Based on Markov decision process theory, a reinforcement learning method for the stealthy engagement strategy of air vehicles in 3D space is proposed. The advantage region and the exposure region of the environment model are defined. To overcome the curse of dimensionality, a Q-learning algorithm based on the radial basis function neural network (RBFNN) is put forward, and a ranked sampling method for the training samples is explained. Simulations of two different situations are then carried out, and the results show that, with a reasonable ranked sampling method, the proposed algorithm effectively generates stealthy engagement strategies.
Keywords: reinforcement learning  stealthy engagement  Markov decision process  dynamic programming  air combat decision
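
The abstract names Q-learning with an RBFNN value approximator but gives no implementation details, so the following is only a minimal sketch of that general technique and not the authors' method. It assumes a hypothetical 1-D corridor (cells 0 to 10, reward 1 on reaching cell 10) in place of the paper's 3D engagement model, fixed Gaussian centers with trainable output weights only, and plain epsilon-greedy sampling instead of the ranked sampling scheme. The advantage and exposure regions are not modeled. All names (`RBFQ`, `train`, `N_CELLS`, `ACTIONS`) and all parameter values are illustrative assumptions.

```python
import math
import random

N_CELLS = 10        # hypothetical corridor: cells 0..10, goal at cell 10
ACTIONS = (-1, +1)  # action 0 = step left, action 1 = step right


class RBFQ:
    """Q(s, a) as one RBF network per action: fixed Gaussian hidden
    units shared by all actions, one trainable weight vector per action."""

    def __init__(self, centers, width, n_actions, lr=0.1, gamma=0.9):
        self.centers = centers
        self.width = width
        self.lr = lr
        self.gamma = gamma
        self.w = [[0.0] * len(centers) for _ in range(n_actions)]

    def features(self, s):
        # Gaussian activation of every hidden unit for scalar state s.
        return [math.exp(-((s - c) ** 2) / (2.0 * self.width ** 2))
                for c in self.centers]

    def q(self, s, a):
        return sum(w * p for w, p in zip(self.w[a], self.features(s)))

    def greedy(self, s):
        # Ties resolve to the lowest action index.
        return max(range(len(self.w)), key=lambda a: self.q(s, a))

    def update(self, s, a, r, s_next, done):
        # Q-learning target: r + gamma * max_b Q(s', b), or r at a terminal step.
        best_next = 0.0 if done else max(
            self.q(s_next, b) for b in range(len(self.w)))
        td_error = r + self.gamma * best_next - self.q(s, a)
        # Gradient step on the output weights of the chosen action only.
        for i, p in enumerate(self.features(s)):
            self.w[a][i] += self.lr * td_error * p
        return td_error


def train(episodes=2000, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    centers = [i / N_CELLS for i in range(N_CELLS + 1)]
    agent = RBFQ(centers, width=0.05, n_actions=len(ACTIONS))
    for _episode in range(episodes):
        k = rng.randrange(N_CELLS)   # random non-goal start cell
        for _step in range(50):      # cap on episode length
            s = k / N_CELLS
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = agent.greedy(s)              # exploit
            k_next = min(max(k + ACTIONS[a], 0), N_CELLS)
            done = k_next == N_CELLS
            agent.update(s, a, 1.0 if done else 0.0, k_next / N_CELLS, done)
            k = k_next
            if done:
                break
    return agent
```

Calling `agent = train()` and then `agent.greedy(0.5)` returns the index of the action the learned approximator prefers in the middle of the corridor. The narrow width (half the center spacing) keeps the features nearly local, which makes this toy problem behave almost like tabular Q-learning. A high-dimensional state space like the one the abstract describes would need wider or adaptively placed centers, and Q-learning with function approximation has no general convergence guarantee.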
This article is indexed in CNKI and other databases.