Multi-agent reinforcement learning based on policies of global objective

Author names (Chinese):	张化祥, 黄上腾

Excerpt (1. Introduction):	Because an agent's reward is a function of all agents' joint action, some fundamental changes must be made when applying RL [1] to multi-agent domains. By adapting single-agent Q-learning [2] to Markov games, several algorithms have been proposed, such as Littman's minimax Q-learning (minimax-Q) [3], Hu et al.'s Nash Q-learning (Nash-Q) [4, 5], Claus et al.'s cooperative multi-agent Q-learning [6], Bowling et al.'s multi-agent Q-learning using a variable learning rate [7-9], …

Zhang Huaxiang, Huang Shangteng. Multi-agent reinforcement learning based on policies of global objective [J]. Journal of Systems Engineering and Electronics, 2005, 16(3).

Authors:	Zhang Huaxiang, Huang Shangteng

Institution:	1. Dept. of Computer Science, Shandong Normal Univ., Jinan 250014, P. R. China; 2. Dept. of Computer Science and Engineering, Shanghai Jiaotong Univ., Shanghai 200030, P. R. China

Abstract:	For general-sum games, we take the collective rationality of all agents into account to define the agents' global objective, and propose a novel multi-agent reinforcement learning (RL) algorithm based on a global policy. In each learning step, all agents commit to selecting the global policy in order to achieve the global goal. We prove that the learning algorithm converges under certain restrictions on the stage games defined by the learned Q values, and show that its computational time complexity is considerably lower than that of previously developed multi-agent learning algorithms for general-sum games. An example is analyzed to show the algorithm's merits.

Keywords:	Markov games, reinforcement learning, collective rationality, policy
This article is indexed in CNKI, Wanfang Data, and other databases.
