Research on Reinforcement Learning Algorithms Based on the Average Reward Model
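Cite this article: HUANG Bing-qiang, CAO Guang-yi, FEI Yan-qiong, WANG Zhan-quan. Research on reinforcement learning algorithms based on the average reward model[J]. Journal of University of Shanghai for Science and Technology, 2006, 28(5): 418-422.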
Authors: HUANG Bing-qiang  CAO Guang-yi  FEI Yan-qiong  WANG Zhan-quan
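Affiliations: 1. School of Electronic, Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200030
2. School of Mechanical and Power Engineering, Shanghai Jiao Tong University, Shanghai 200030
3. School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237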
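Abstract: For cyclical tasks with absorbing goal states, a reasonable approach is reinforcement learning based on the average reward model. Average reward reinforcement learning offers advantages such as fast convergence and strong robustness. This paper introduces the three main average reward reinforcement learning algorithms, namely R-learning, H-learning and LC-learning, and presents the main applications and research directions of average reward reinforcement learning.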

Keywords: average reward reinforcement learning  R-learning  H-learning  LC-learning
Article ID: 1007-6735(2006)05-0418-05
Received: 2005-12-05
Revised: 2005-12-05

Survey of average reinforcement learning algorithms
HUANG Bing-qiang, CAO Guang-yi, FEI Yan-qiong, WANG Zhan-quan. Survey of average reinforcement learning algorithms[J]. Journal of University of Shanghai for Science and Technology, 2006, 28(5): 418-422.
Authors: HUANG Bing-qiang  CAO Guang-yi  FEI Yan-qiong  WANG Zhan-quan
Institution: 1. School of Electronic, Information and Electrical Engineering, Shanghai Jiaotong University, Shanghai 200030, China; 2. School of Mechanical Engineering, Shanghai Jiaotong University, Shanghai 200030, China; 3. College of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
Abstract: For cyclical tasks with absorbing goal states, it is reasonable to adopt average reward reinforcement learning algorithms, which have the merit of converging quickly and robustly. The three main average reward reinforcement learning algorithms, R-learning, H-learning and LC-learning, are reviewed, and the main applications and future research directions are presented.
Keywords: average reward reinforcement learning  R-learning  H-learning  LC-learning
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据) and other databases.