
Prior Knowledge Based Reinforcement Learning System
Cite this article: LI Wei, HE Xue-song, YE Qing-tai, ZHU Chang-ming. Prior Knowledge Based Reinforcement Learning System[J]. Journal of Shanghai Jiaotong University, 2004, 38(8): 1362-1365.
Authors: LI Wei, HE Xue-song, YE Qing-tai, ZHU Chang-ming
Affiliation: School of Mechanical and Power Engineering, Shanghai Jiao Tong University, Shanghai 200030, China
Funding: National Natural Science Foundation of China (69975013)
Article ID: 1006-2467(2004)08-1362-04
Revised manuscript received: July 13, 2003

Abstract: Slow convergence is the main disadvantage of reinforcement learning (RL) algorithms. It arises mainly because the models used by RL algorithms usually assume that the system parameters and any prior knowledge are unknown, so the algorithm must search for the optimal policy from scratch over a large search space. This paper proposes building the RL system on prior knowledge relevant to the system, which makes effective use of earlier work, avoids duplicating that effort, and accelerates convergence. The soundness and effectiveness of the proposed system were verified on the elevator group control problem, where an RL system based on elevator domain knowledge was developed and performed well.

Keywords: prior knowledge; reinforcement learning; elevators; control system
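
The abstract does not say how the prior knowledge enters the learner, so the following is only a minimal sketch of one common approach, not the authors' method: seeding a tabular Q-learning agent's value table from a rule of thumb instead of from zeros. The chain environment, the constants, and every function name (`step`, `blank_table`, `heuristic_prior`, `train`) are hypothetical and chosen purely for illustration; nothing here models the elevator group control problem.

```python
import random

N_STATES = 8            # chain of states 0..7; state 7 is the goal
ACTIONS = (-1, +1)      # action 0 moves left, action 1 moves right
GAMMA, ALPHA = 0.9, 0.5  # discount factor and learning rate (illustrative values)


def step(state, action):
    """Deterministic chain: reward 1 on reaching the goal, 0 otherwise."""
    nxt = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def blank_table():
    """No prior knowledge: every action value starts at zero."""
    return [[0.0, 0.0] for _ in range(N_STATES)]


def heuristic_prior():
    """Assumed prior: a designer's rule of thumb that moving right is better."""
    return [[0.0, 0.1] for _ in range(N_STATES)]


def greedy(q, state, rng):
    """Pick a highest-valued action, breaking ties at random."""
    best = max(q[state])
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def run_episode(q, rng, epsilon, max_steps=10_000):
    """One epsilon-greedy Q-learning episode; returns the number of steps taken."""
    state, steps = 0, 0
    while steps < max_steps:
        explore = rng.random() < epsilon
        action = rng.randrange(2) if explore else greedy(q, state, rng)
        nxt, reward, done = step(state, action)
        # Standard one-step Q-learning target.
        target = reward + (0.0 if done else GAMMA * max(q[nxt]))
        q[state][action] += ALPHA * (target - q[state][action])
        state, steps = nxt, steps + 1
        if done:
            break
    return steps


def train(q, episodes=20, epsilon=0.1, seed=0):
    """Train in place on table q; returns the step count of each episode."""
    rng = random.Random(seed)
    return [run_episode(q, rng, epsilon) for _ in range(episodes)]


if __name__ == "__main__":
    print("blank start:", train(blank_table()))
    print("prior start:", train(heuristic_prior()))
```

In this toy setting with `epsilon=0.0`, the prior-seeded agent walks straight to the goal in 7 steps in every episode, because the left-action values never leave zero while the right-action values stay positive. The blank agent has no preference at first and wanders until it stumbles on the reward. Whether a prior helps in practice depends on how accurate it is; a misleading prior can slow learning instead.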
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.