
Optimal Response Learning and Its Convergence in Multiagent Domains
Authors: ZHANG Hua-xiang (张化祥), HUANG Shang-teng (黄上腾), LE Jia-jin (乐嘉锦)
Affiliations: [1] Department of Computer Science & Engineering, Shanghai Jiaotong University, Shanghai 200030; [2] Department of Computer Science & Engineering, Donghua University, Shanghai 200051
Abstract: In multiagent reinforcement learning, an agent adopts different learning rules, and achieves different learning performance, depending on what it assumes about its opponents' policies. We prove that, in multiagent domains, convergence of the Q values is guaranteed only when the agent behaves optimally and its opponents' strategies satisfy certain conditions, and that an agent achieves its best learning performance when it adopts the same learning algorithm as its opponents.

Keywords: agent; Markov decision process; reinforcement learning; learning rule; multiagent domain
Received: 2003-12-30

ZHANG Hua-xiang, HUANG Shang-teng, LE Jia-jin. Optimal Response Learning and Its Convergence in Multiagent Domains [J]. Journal of Donghua University, 2005, 22(3): 116-119.
Keywords:multiagent  learning  policy
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.