
Linguistic reward-oriented Takagi-Sugeno fuzzy reinforcement learning

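Cite this article:	YAN Xiongwei, DENG Zhidong, SUN Zengqi. Linguistic reward-oriented Takagi-Sugeno fuzzy reinforcement learning[J]. Journal of Tsinghua University (Science and Technology), 2002, 42(10): 1393-1396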

Authors:	YAN Xiongwei, DENG Zhidong, SUN Zengqi

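Affiliation:	State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China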

Funding:	National Key Basic Research Development Program of China ("973" Program) (G1999032707)

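Abstract:	Two important subproblems of reinforcement learning, continuous spaces and linguistic rewards, are considered together, and a new learning method is proposed: linguistic reward-oriented Takagi-Sugeno (T-S) fuzzy reinforcement learning. The learning agent is built on the Q-learning method and a Takagi-Sugeno fuzzy inference system. It is suited to complex learning tasks in continuous domains and can also be used to design Takagi-Sugeno fuzzy logic controllers. Taking a double inverted pendulum control system as an example, a simulation study verifies the effectiveness of the learning algorithm.
Keywords:	reinforcement learning; linguistic rewards; T-S fuzzy inference system; neuro-fuzzy control; function approximation; Q-learning; fuzzy number
Article ID:	1000-0054(2002)10-1393-04
Revised:	2001-03-11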
Linguistic reward-oriented T-S fuzzy reinforcement learning

YAN Xiongwei, DENG Zhidong, SUN Zengqi. Linguistic reward-oriented T-S fuzzy reinforcement learning[J]. Journal of Tsinghua University (Science and Technology), 2002, 42(10): 1393-1396

Authors:	YAN Xiongwei, DENG Zhidong, SUN Zengqi

Abstract:	This paper presents a learning method that simultaneously addresses two significant subproblems in reinforcement learning: continuous spaces and linguistic rewards. A linguistic reward-oriented Takagi-Sugeno fuzzy reinforcement learning (LRTSFRL) model was constructed by combining the Q-learning method with Takagi-Sugeno fuzzy inference systems. The proposed method can solve complicated learning tasks in continuous domains and can be used to design Takagi-Sugeno fuzzy logic controllers. Simulation experiments with a double inverted pendulum system demonstrated the effectiveness of the scheme.

Keywords:	reinforcement learning; linguistic rewards; Takagi-Sugeno fuzzy inference systems; neuro-fuzzy control; function approximation; Q-learning; fuzzy number
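
The abstract describes combining Q-learning with a Takagi-Sugeno fuzzy inference system and learning from linguistic rewards, but this record does not give the algorithm itself. The following is only a minimal sketch of the general idea, not the LRTSFRL method from the paper. It assumes a one-dimensional state, evenly spaced triangular membership functions, zero-order (constant) T-S rule consequents, and linguistic rewards represented as triangular fuzzy numbers that are defuzzified by their centroid. All names (`LINGUISTIC_REWARDS`, `defuzzify`, `memberships`, `FuzzyQ`) and all numeric values are hypothetical.

```python
# Illustrative sketch only; NOT the algorithm from the paper.
# Assumption: linguistic rewards are triangular fuzzy numbers (low, peak, high).
LINGUISTIC_REWARDS = {
    "bad": (-1.0, -1.0, 0.0),
    "fair": (-0.5, 0.0, 0.5),
    "good": (0.0, 1.0, 1.0),
}


def defuzzify(label):
    """Centroid of a triangular fuzzy number: the mean of its three vertices."""
    low, peak, high = LINGUISTIC_REWARDS[label]
    return (low + peak + high) / 3.0


def memberships(x, centers):
    """Normalized triangular firing strengths on an evenly spaced grid."""
    x = min(max(x, centers[0]), centers[-1])  # clamp the state into the grid
    width = centers[1] - centers[0]
    raw = [max(0.0, 1.0 - abs(x - c) / width) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]


class FuzzyQ:
    """Assumed zero-order T-S rules: IF x is A_i THEN Q(x, a) = q[i][a]."""

    def __init__(self, centers, n_actions, alpha=0.1, gamma=0.9):
        self.centers = centers
        self.n_actions = n_actions
        self.q = [[0.0] * n_actions for _ in centers]
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor

    def q_values(self, x):
        """T-S inference: firing-strength-weighted sum of rule consequents."""
        phi = memberships(x, self.centers)
        return [
            sum(p * row[a] for p, row in zip(phi, self.q))
            for a in range(self.n_actions)
        ]

    def update(self, x, action, reward_label, x_next):
        """One Q-learning step driven by a linguistic reward label."""
        reward = defuzzify(reward_label)
        target = reward + self.gamma * max(self.q_values(x_next))
        td_error = target - self.q_values(x)[action]
        # Each rule's consequent moves in proportion to its firing strength.
        for i, p in enumerate(memberships(x, self.centers)):
            self.q[i][action] += self.alpha * p * td_error
        return td_error


agent = FuzzyQ(centers=[0.0, 0.5, 1.0], n_actions=2)
agent.update(x=0.0, action=1, reward_label="good", x_next=0.5)
```

Reducing each linguistic reward to a single crisp number is the simplest possible choice and discards the spread of the fuzzy number; a method built around linguistic rewards may well propagate the fuzzy quantity further than this sketch does.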
This article is indexed in CNKI, Wanfang Data, and other databases.
