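Linguistic reward-oriented Takagi-Sugeno fuzzy reinforcement learning
Cite this article: YAN Xiongwei, DENG Zhidong, SUN Zengqi. Linguistic reward-oriented Takagi-Sugeno fuzzy reinforcement learning[J]. Journal of Tsinghua University (Science and Technology), 2002, 42(10): 1393-1396
Authors: YAN Xiongwei  DENG Zhidong  SUN Zengqi
Affiliation: State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
Funding: National Key Basic Research Development Program of China ("973" Program) (G1999032707)
Abstract: Two important subproblems of reinforcement learning, continuous spaces and linguistic rewards, are considered together, and a new learning method is proposed: linguistic reward-oriented Takagi-Sugeno (T-S) fuzzy reinforcement learning. The learning agent is built on the Q-learning method and a Takagi-Sugeno fuzzy inference system. It is suited to complex learning tasks in continuous domains and can also be used to design Takagi-Sugeno fuzzy logic controllers. Taking a double inverted pendulum control system as an example, simulation studies verified the effectiveness of the learning algorithm.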

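Keywords: reinforcement learning  linguistic rewards  T-S fuzzy inference system  neuro-fuzzy control  function approximation  Q-learning  fuzzy numbers
Article ID: 1000-0054(2002)10-1393-04
Revised: 2001-03-11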

Linguistic reward-oriented T-S fuzzy reinforcement learning
YAN Xiongwei, DENG Zhidong, SUN Zengqi. Linguistic reward-oriented T-S fuzzy reinforcement learning[J]. Journal of Tsinghua University (Science and Technology), 2002, 42(10): 1393-1396
Authors: YAN Xiongwei  DENG Zhidong  SUN Zengqi
Abstract: This paper presents a learning method that simultaneously addresses two significant subproblems in reinforcement learning: continuous spaces and linguistic rewards. A linguistic reward-oriented Takagi-Sugeno fuzzy reinforcement learning (LRTSFRL) model was constructed by combining the Q-learning method with Takagi-Sugeno fuzzy inference systems. The proposed method is capable of solving complicated learning tasks in continuous domains and can be used to design Takagi-Sugeno fuzzy logic controllers. Experiments with a double inverted pendulum system demonstrated the improved performance of the scheme.
Keywords: reinforcement learning  linguistic rewards  Takagi-Sugeno fuzzy inference systems  neuro-fuzzy control  function approximation  Q-learning  fuzzy numbers
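
The abstract names the ingredients (Q-learning, a Takagi-Sugeno fuzzy inference system, linguistic rewards expressed as fuzzy numbers) but not the update equations. The following is only a minimal sketch of how such pieces can fit together in a generic fuzzy Q-learning scheme; it is not the authors' LRTSFRL algorithm. The one-dimensional state, the triangular membership functions, the reward labels and their fuzzy numbers, the centroid defuzzification, and every function name and parameter value are assumptions made for illustration.

```python
# Illustrative sketch only: zero-order Takagi-Sugeno Q-function with a TD(0) update.
# All names, fuzzy sets, and numeric values are hypothetical, not from the paper.

def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Three fuzzy sets covering a 1-D state clamped to [-1, 1] (assumed partition).
SETS = [(-2.0, -1.0, 0.0), (-1.0, 0.0, 1.0), (0.0, 1.0, 2.0)]
ACTIONS = [-1.0, 0.0, 1.0]  # assumed discrete action set

# Linguistic rewards modeled as triangular fuzzy numbers (assumed values).
LINGUISTIC = {
    "bad": (-1.0, -0.8, -0.4),
    "fair": (-0.3, 0.0, 0.3),
    "good": (0.4, 0.8, 1.0),
}

def defuzzify(label):
    """Centroid of a triangular fuzzy number: one simple way to get a scalar reward."""
    a, b, c = LINGUISTIC[label]
    return (a + b + c) / 3.0

def firing(s):
    """Normalized rule firing strengths for state s."""
    s = max(-1.0, min(1.0, s))  # clamp so at least one rule always fires
    w = [tri(s, *p) for p in SETS]
    total = sum(w)
    return [wi / total for wi in w]

def q_value(q, s, a_idx):
    """T-S output: firing-strength-weighted sum of per-rule constant consequents."""
    return sum(phi * q[i][a_idx] for i, phi in enumerate(firing(s)))

def td_update(q, s, a_idx, label, s_next, alpha=0.5, gamma=0.9):
    """One Q-learning step; each rule is credited in proportion to its firing strength."""
    r = defuzzify(label)
    target = r + gamma * max(q_value(q, s_next, j) for j in range(len(ACTIONS)))
    delta = target - q_value(q, s, a_idx)
    for i, phi in enumerate(firing(s)):
        q[i][a_idx] += alpha * delta * phi
    return delta

# Usage: one update from a zero-initialized rule table.
q = [[0.0] * len(ACTIONS) for _ in SETS]
td_update(q, s=0.0, a_idx=2, label="good", s_next=0.5)
```

Because the rule consequents here are constants, the learned table `q` can be read directly as a zero-order T-S rule base, which is one way a learned value function could double as a fuzzy controller design.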
This article is indexed in CNKI, Wanfang Data, and other databases.