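Reinforcement Learning Based on Hierarchical Fuzzy Neural Networks in Multi-Variable Environments
Cite this article: ZHANG Wen-zhi, LÜ Tian-sheng, WANG Le-tian. Reinforcement learning based on hierarchical fuzzy neural networks in multi-variable environments[J]. Journal of Shanghai Jiaotong University, 2004, 38(9): 1557-1561.
Authors: 张文志 (ZHANG Wen-zhi)  吕恬生 (LÜ Tian-sheng)  王乐天 (WANG Le-tian)
Affiliation: School of Mechanical and Power Engineering, Shanghai Jiaotong University, Shanghai 200030
Abstract: To address the complexity of learning in multi-variable continuous spaces, a reinforcement learning method using hierarchical fuzzy neural networks (HFNN) is presented. Two HFNNs with identical structure simultaneously perform fuzzy action composition and value function approximation, and the network parameters are tuned online by gradient descent. The method effectively resolves the combinatorial explosion of rules encountered in multi-variable environments and reduces both computation and storage. The output of one stage of the HFNN is no longer used as an antecedent of the next stage; it is used directly in that stage's consequent part, which overcomes the design problems that arise when the previous stage's output has an unclear meaning or no meaning at all. Simulation of a double inverted pendulum shows that the method is correct and feasible.

Keywords: fuzzy system  hierarchical fuzzy neural network  reinforcement learning  double inverted pendulum
Article ID: 1006-2467(2004)09-1557-05
Revision date: 27 October 2003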
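
The rule explosion named in the abstract can be made concrete with a small count. The sketch below is illustrative only and is not taken from the paper: it assumes six state variables for a cart-mounted double inverted pendulum, five fuzzy sets per variable, and a hierarchy that groups the inputs two per stage. The function names `flat_rule_count` and `staged_rule_count` are hypothetical.

```python
def flat_rule_count(n_inputs, n_sets):
    """Rules in a single flat rule base: one rule per combination of fuzzy sets."""
    return n_sets ** n_inputs


def staged_rule_count(stage_sizes, n_sets):
    """Rules in a hierarchy whose stages each have their own small rule base.

    stage_sizes[k] is the number of variables in the antecedent of stage k
    (an assumed grouping, not the paper's).
    """
    return sum(n_sets ** size for size in stage_sizes)


# Assumed setup: 6 state variables, 5 fuzzy sets each, 3 stages of 2 inputs.
flat = flat_rule_count(6, 5)
staged = staged_rule_count([2, 2, 2], 5)
print(flat, staged)
```

Under these assumptions the flat rule base grows exponentially with the number of variables, while the staged one grows linearly with the number of stages.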

Hierarchical Fuzzy Neural-Networks Based Reinforcement Learning in Multi-Variable Environment
ZHANG Wen-zhi, LÜ Tian-sheng, WANG Le-tian. Hierarchical Fuzzy Neural-Networks Based Reinforcement Learning in Multi-Variable Environment[J]. Journal of Shanghai Jiaotong University, 2004, 38(9): 1557-1561.
Authors:ZHANG Wen-zhi  LÜ Tian-sheng  WANG Le-tian
Abstract:A reinforcement learning approach based on hierarchical fuzzy neural networks (HFNN) is proposed for complicated learning tasks in continuous multi-variable environments. Two HFNNs with the same structure simultaneously perform fuzzy action composition and value function approximation. The network parameters are tuned online by gradient descent. The method resolves the combinatorial explosion of rules and reduces the computation and memory required. The output of the previous stage of the HFNN is no longer used in the IF part of the next stage; it is used only in the THEN part. This avoids the design difficulty that arises when the output of the previous stage is meaningless or its meaning is unclear. Simulation of the double inverted pendulum balancing problem shows that the method is correct and feasible.
Keywords:fuzzy system  hierarchical fuzzy neural-networks  reinforcement learning  double inverted pendulum
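
The idea that a stage's rules fire on its own inputs while the previous stage's output enters only the THEN part can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's network: Gaussian membership functions, one input per stage, and consequents of the form `bias + gain * prev_output` are all assumed, and the names `gaussian` and `stage_output` are hypothetical.

```python
import math


def gaussian(x, center, width):
    """Assumed Gaussian membership function."""
    return math.exp(-((x - center) / width) ** 2)


def stage_output(x, centers, width, bias, gain, prev_output=0.0):
    """One stage of an assumed hierarchical fuzzy system.

    Firing strengths depend only on this stage's input x (the IF part).
    prev_output appears only in the rule consequents (the THEN part).
    """
    strengths = [gaussian(x, c, width) for c in centers]
    consequents = [b + g * prev_output for b, g in zip(bias, gain)]
    weighted = sum(w * c for w, c in zip(strengths, consequents))
    return weighted / sum(strengths)


# Two chained stages with made-up parameters.
centers = [-1.0, 0.0, 1.0]
y1 = stage_output(0.3, centers, 0.5, bias=[-1.0, 0.0, 1.0], gain=[0.0, 0.0, 0.0])
y2 = stage_output(-0.2, centers, 0.5, bias=[0.5, 0.0, -0.5], gain=[1.0, 1.0, 1.0],
                  prev_output=y1)
print(y1, y2)
```

Because `prev_output` never passes through a membership function, no fuzzy sets have to be designed for an intermediate quantity that may lack physical meaning, which is the design problem the abstract describes.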
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.