
Multi-agent decision-making method based on Actor-Critic framework and its application in wargame
Cite this article: LI Chen, HUANG Yanyan, ZHANG Yongliang, CHEN Tiande. Multi-agent decision-making method based on Actor-Critic framework and its application in wargame[J]. System Engineering and Electronics (系统工程与电子技术), 2021, 43(3): 755-762.
Authors: LI Chen (李琛)  HUANG Yanyan (黄炎焱)  ZHANG Yongliang (张永亮)  CHEN Tiande (陈天德)
Institution: 1. School of Automation, Nanjing University of Science and Technology, Nanjing 210094, China; 2. Command and Control Engineering College, Army Engineering University, Nanjing 210007, China
Funding: Supported by the National Natural Science Foundation of China (61374186) and the 2018 Equipment Pre-research Field Fund (61403120205).
Abstract: Intelligent tactical wargaming, which applies artificial intelligence to wargame simulation, has been developing year by year. A decision-making method based on the Actor-Critic framework can achieve dynamic decision-making for tactical actions in intelligent tactical wargames. However, if the Critic network evaluates only a single operator and the networks of the multiple operators do not cooperate, the friendly operators each decide their actions independently and the resulting decisions are not intelligent enough. To address this shortcoming and raise the intelligence level of wargaming, a multi-agent decision-making method that combines reinforcement learning with rules is proposed. The method focuses on using reinforcement learning to analyze the action decisions of multiple operators, and combines production rules to plan tactical decisions. An action decision model for multiple operators, with distributed execution and centralized training, is constructed on the Actor-Critic framework. Compared with a closed action-decision learning method in which the operators do not communicate with one another, the proposed distributed-execution, centralized-training method is more advantageous and effective.

Keywords: intelligent tactics  wargame  multi-agent reinforcement learning  Actor-Critic framework  distributed execution and centralized training
Received: 2020-05-06
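
The abstract's contrast between a Critic that scores one operator in isolation and a centrally trained Critic with distributed execution can be illustrated with a toy sketch. The following is not the authors' model: it is a minimal tabular example under stated assumptions (two operators, binary local observations and actions, a made-up shared reward, and hypothetical names such as `team_reward`, `train`, and `act`). The Critic is trained on the joint observation and joint action, while each Actor conditions only on its own local observation.

```python
import math
import random

N_OPERATORS = 2   # assumption: two friendly operators
N_OBS = 2         # assumption: each operator sees a local observation in {0, 1}
N_ACTIONS = 2     # assumption: each operator picks an action in {0, 1}


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def team_reward(obs, actions):
    # Toy shared reward (an assumption, not from the paper): 1 only if every
    # operator's action matches its own observation, so the team succeeds
    # or fails on the joint action.
    return 1.0 if all(a == o for a, o in zip(actions, obs)) else 0.0


def train(episodes=2000, actor_lr=0.5, critic_lr=0.5, seed=0):
    rng = random.Random(seed)
    # Distributed actors: one logit table per operator, indexed by local obs.
    actors = [[[0.0] * N_ACTIONS for _ in range(N_OBS)]
              for _ in range(N_OPERATORS)]
    # Centralized critic: value table over (joint observation, joint action).
    critic = {}
    for _ in range(episodes):
        obs = tuple(rng.randrange(N_OBS) for _ in range(N_OPERATORS))
        probs = [softmax(actors[i][obs[i]]) for i in range(N_OPERATORS)]
        actions = tuple(rng.choices(range(N_ACTIONS), weights=p)[0]
                        for p in probs)
        reward = team_reward(obs, actions)
        # Critic update: move Q(joint obs, joint action) toward the reward.
        key = (obs, actions)
        q = critic.get(key, 0.0)
        q += critic_lr * (reward - q)
        critic[key] = q
        # Actor update: softmax policy gradient weighted by the central Q.
        for i in range(N_OPERATORS):
            for a in range(N_ACTIONS):
                grad = (1.0 if a == actions[i] else 0.0) - probs[i][a]
                actors[i][obs[i]][a] += actor_lr * q * grad
    return actors, critic


def act(actors, operator, local_obs):
    # Execution uses only the operator's own observation; no critic needed.
    logits = actors[operator][local_obs]
    return max(range(N_ACTIONS), key=lambda a: logits[a])


actors, critic = train()
```

After training, `act(actors, i, o)` selects an action for operator `i` from its local observation alone, which is the distributed-execution half of the scheme; the `critic` table is used only during training. A realistic wargame model would replace the tables with neural networks and the one-step reward with multi-step returns.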
This article is indexed in 维普 (VIP) and other databases.