
Multi-agent reinforcement learning based on policies of global objective
Authors: Zhang Huaxiang (张化祥), Huang Shangteng (黄上腾)
Affiliations: Dept. of Computer Science, Shandong Normal Univ., Jinan 250014, P. R. China; Dept. of Computer Science and Engineering, Shanghai Jiaotong Univ., Shanghai 200030, P. R. China
Summary: 1. INTRODUCTION. Because an agent's reward is a function of all agents' joint action, some fundamental changes must be made when applying RL [1] to multi-agent domains. By adapting single-agent Q-learning [2] to Markov games, several algorithms have been proposed, such as Littman's minimax Q-learning (minimax-Q) [3], Hu et al.'s Nash Q-learning (Nash-Q) [4, 5], Claus et al.'s cooperative multi-agent Q-learning [6], Bowling et al.'s multi-agent Q-learning using a variable learning rate [7~9], …


Zhang Huaxiang, Huang Shangteng. Multi-agent reinforcement learning based on policies of global objective[J]. Journal of Systems Engineering and Electronics, 2005, 16(3).
Authors: Zhang Huaxiang, Huang Shangteng
Institution: 1. Dept. of Computer Science, Shandong Normal Univ., Jinan 250014, P. R. China
2. Dept. of Computer Science and Engineering, Shanghai Jiaotong Univ., Shanghai 200030, P. R. China
Abstract: In general-sum games, taking the collective rationality of all agents into account, we define the agents' global objective and propose a novel multi-agent reinforcement learning (RL) algorithm based on a global policy. In each learning step, all agents commit to selecting the global policy in order to achieve the global goal. We prove that this learning algorithm converges under certain restrictions on the stage games of the learned Q values, and show that its computational time complexity is considerably lower than that of previously developed multi-agent learning algorithms for general-sum games. An example is analyzed to show the algorithm's merits.
Keywords: Markov games; reinforcement learning; collective rationality; policy
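
The idea stated in the abstract, that in each learning step all agents commit to one jointly chosen policy, can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not define the global objective, so the sketch assumes it is the sum of all agents' Q values, and it uses a single-state (repeated) game for brevity. The names `global_greedy` and `train` and the payoff table in the demo are hypothetical.

```python
import random
from itertools import product


def global_greedy(Q, state, joint_actions):
    """Joint action that maximizes the summed Q values of all agents.

    ASSUMPTION: the global objective is the sum of the agents' Q values;
    the abstract does not give its actual definition.
    """
    return max(joint_actions, key=lambda a: sum(q[(state, a)] for q in Q))


def train(rewards, n_actions, episodes=5000, alpha=0.1, gamma=0.9,
          eps=0.2, seed=0):
    """Tabular joint-action Q-learning on a single-state general-sum game.

    rewards maps each joint action to a tuple of per-agent rewards.
    Every agent keeps its own Q table over joint actions, but all agents
    select the same joint action (epsilon-greedy on the assumed objective).
    """
    rng = random.Random(seed)
    n_agents = len(n_actions)
    joint_actions = list(product(*(range(n) for n in n_actions)))
    state = 0  # single state, so the next state is always the same state
    Q = [{(state, a): 0.0 for a in joint_actions} for _ in range(n_agents)]

    for _ in range(episodes):
        # Explore with probability eps, otherwise follow the shared policy.
        if rng.random() < eps:
            a = rng.choice(joint_actions)
        else:
            a = global_greedy(Q, state, joint_actions)
        # Bootstrap from the joint action the agents would commit to next.
        a_next = global_greedy(Q, state, joint_actions)
        for i in range(n_agents):
            target = rewards[a][i] + gamma * Q[i][(state, a_next)]
            Q[i][(state, a)] += alpha * (target - Q[i][(state, a)])
    return Q, joint_actions


if __name__ == "__main__":
    # Hypothetical 2x2 general-sum payoffs: (reward agent 0, reward agent 1).
    demo_rewards = {(0, 0): (1.0, 1.0), (0, 1): (0.0, 3.0),
                    (1, 0): (3.0, 0.0), (1, 1): (2.0, 2.0)}
    Q, joint_actions = train(demo_rewards, n_actions=(2, 2))
    print("learned joint action:", global_greedy(Q, 0, joint_actions))
```

Because each step only takes a maximum over the joint actions, the sketch avoids solving for an equilibrium of the stage game at every update, which is one plausible reading of the lower computational cost claimed in the abstract.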
This article is indexed in CNKI, Wanfang Data (万方数据), and other databases.