
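Multi-agent decision-making method based on Actor-Critic framework and its application in wargame
Cite this article: LI Chen, HUANG Yanyan, ZHANG Yongliang, CHEN Tiande. Multi-agent decision-making method based on Actor-Critic framework and its application in wargame[J]. Systems Engineering and Electronics (系统工程与电子技术), 2021, 43(3): 755-762. DOI: 10.12305/j.issn.1001-506X.2021.03.20
Authors: LI Chen  HUANG Yanyan  ZHANG Yongliang  CHEN Tiande
Affiliations: 1. School of Automation, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094; 2. Command and Control Engineering College, Army Engineering University, Nanjing, Jiangsu 210007
Funding: Supported by the National Natural Science Foundation of China (61374186) and the 2018 Equipment Pre-Research Field Fund (61403120205).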
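Abstract: Intelligent tactical wargames, which apply artificial intelligence to wargame deduction, are developing year by year, and decision-making methods based on the Actor-Critic framework can realize dynamic decision-making for tactical actions in such wargames. However, if the Critic network evaluates only a single operator and the networks of multiple operators do not cooperate, the action decisions that friendly operators make individually will not be intelligent enough. To address this shortcoming, a multi-agent decision-making method based on reinforcement learning combined with rules is proposed, to improve wargame deduction...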

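Keywords: intelligent tactics  wargame deduction  multi-agent reinforcement learning  Actor-Critic framework  distributed execution and centralized training
Received: 2020-05-06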

Multi-agent decision-making method based on Actor-Critic framework and its application in wargame
LI Chen, HUANG Yanyan, ZHANG Yongliang, CHEN Tiande. Multi-agent decision-making method based on Actor-Critic framework and its application in wargame[J]. Systems Engineering and Electronics, 2021, 43(3): 755-762. DOI: 10.12305/j.issn.1001-506X.2021.03.20
Authors: LI Chen  HUANG Yanyan  ZHANG Yongliang  CHEN Tiande
Affiliation: 1. School of Automation, Nanjing University of Science and Technology, Nanjing 210094, China; 2. Command and Control Engineering College, Army Engineering University, Nanjing 210007, China
Abstract: Intelligent tactical wargames, which apply artificial intelligence to wargame deduction, are developing year by year. Decision-making methods based on the Actor-Critic framework can realize dynamic decision-making for tactical actions in such wargames. However, if the Critic network evaluates only a single agent and there is no cooperation among multiple agents, the decisions each agent makes on its own will not be intelligent enough. To raise the intelligence level of wargame deduction, a multi-agent decision-making method based on reinforcement learning combined with rules is proposed. The work focuses on using reinforcement learning to analyze the action decisions of multiple agents, and combines it with production rules to plan tactical decisions. An action decision model for multiple agents, with distributed execution and centralized training, is constructed on the Actor-Critic framework. Compared with a closed action decision-making learning method in which the operators do not communicate with each other, the proposed distributed execution and centralized training method is more advantageous and effective.
Keywords: intelligent tactics  wargame  multi-agent reinforcement learning  Actor-Critic framework  distributed execution and centralized training
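
The abstract above contrasts a Critic that evaluates each agent in isolation with a scheme of centralized training and distributed execution. As a rough illustration of that general idea only, and not the authors' model, the following Python sketch trains two tabular actors, each seeing only its own observation, against one critic that sees the joint observation. The toy task, the reward (1 only when both agents pick action 1), the table sizes, the learning rates, and all function names are assumptions made for this example; the paper's networks, rules, and wargame environment are not reproduced here.

```python
import math
import random

# Toy sizes, assumed for illustration only.
N_AGENTS, N_OBS, N_ACTIONS = 2, 2, 2


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(episodes=5000, actor_lr=0.1, critic_lr=0.1, seed=0):
    rng = random.Random(seed)
    # Decentralized actors: one logit table per agent, indexed only by
    # that agent's own local observation.
    actors = [[[0.0] * N_ACTIONS for _ in range(N_OBS)] for _ in range(N_AGENTS)]
    # Centralized critic: a value estimate indexed by the joint observation.
    critic = {}
    for _ in range(episodes):
        obs = tuple(rng.randrange(N_OBS) for _ in range(N_AGENTS))
        probs = [softmax(actors[i][obs[i]]) for i in range(N_AGENTS)]
        acts = [rng.choices(range(N_ACTIONS), weights=probs[i])[0]
                for i in range(N_AGENTS)]
        # Hypothetical shared team reward: 1 only if every agent picks action 1.
        reward = 1.0 if all(a == 1 for a in acts) else 0.0
        value = critic.get(obs, 0.0)
        advantage = reward - value  # the centralized critic acts as a baseline
        critic[obs] = value + critic_lr * advantage
        for i in range(N_AGENTS):
            for a in range(N_ACTIONS):
                # Gradient of log softmax policy w.r.t. logit a.
                grad_log_pi = (1.0 if a == acts[i] else 0.0) - probs[i][a]
                actors[i][obs[i]][a] += actor_lr * advantage * grad_log_pi
    return actors, critic


def act(actors, agent, local_obs):
    """Distributed execution: greedy action from local observation, no critic."""
    logits = actors[agent][local_obs]
    return max(range(N_ACTIONS), key=lambda a: logits[a])


actors, critic = train()
```

At execution time only `act` is called, so each agent decides from its own observation; the shared critic is needed only while training. A method like the one the abstract describes would replace these tables with neural networks and add the production rules, neither of which is attempted in this sketch.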
This article is indexed in databases including VIP (维普).