Application of Least Squares Support Vector Machine to Reinforcement Learning System

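Cite this article:	WANG Xue-song, TIAN Xi-lan, CHENG Yu-hu, MA Xiao-ping. Application of Least Squares Support Vector Machine to Reinforcement Learning System[J]. Journal of System Simulation, 2008, 20(14).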

Authors:	WANG Xue-song, TIAN Xi-lan, CHENG Yu-hu, MA Xiao-ping

Affiliation:	School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221008

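Funding:	Specialized Research Fund for the Doctoral Program of Higher Education (Ministry of Education); China Postdoctoral Science Foundation; Jiangsu Postdoctoral Science Foundation; China University of Mining and Technology research and teaching-reform program; Qing Lan Project of the Jiangsu Provincial Department of Education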

Abstract:	Q-learning in a continuous state space is formulated as a regression estimation problem for a least squares support vector machine (LS-SVM). The good generalization and nonlinear approximation properties of the LS-SVM are used to realize the mapping from state-action pairs to the Q-value function. To keep computation fast and to meet the online learning requirements of the Q-learning system, the LS-SVM training samples are taken from a sliding time window: sample data are collected and the LS-SVM is trained while the Q-learning system is learning. Simulation results on the mountain car control problem show that the proposed method learns efficiently and can effectively solve the generalization problem of reinforcement learning systems in continuous state spaces.

Keywords:	least squares support vector machine; reinforcement learning; Q-learning; generalization
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
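
The idea summarized in the abstract, Q-value regression with an LS-SVM trained on a sliding window of recent samples, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the kernel, the hyperparameters, the window size, or the exact update rule. The sketch assumes a Gaussian kernel, the standard LS-SVM dual linear system, and the ordinary tabular-style Q-learning target; every name below (`WindowedLSSVM`, `q_update`, `window`, `gamma`, `sigma`, `lr`, `discount`) is hypothetical, and the mountain car environment itself is omitted.

```python
import numpy as np
from collections import deque


def rbf_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


class WindowedLSSVM:
    """LS-SVM regressor trained only on the most recent `window` samples (assumed design)."""

    def __init__(self, window=50, gamma=100.0, sigma=1.0):
        # deque(maxlen=...) drops the oldest sample automatically: the sliding window.
        self.X = deque(maxlen=window)
        self.y = deque(maxlen=window)
        self.gamma, self.sigma = gamma, sigma
        self.alpha, self.b, self.Xs = None, 0.0, None

    def add(self, x, target):
        self.X.append(np.asarray(x, dtype=float))
        self.y.append(float(target))

    def fit(self):
        Xs, y = np.array(self.X), np.array(self.y)
        n = len(y)
        # Standard LS-SVM dual system:
        # [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
        A = np.zeros((n + 1, n + 1))
        A[0, 1:] = 1.0
        A[1:, 0] = 1.0
        A[1:, 1:] = rbf_kernel(Xs, Xs, self.sigma) + np.eye(n) / self.gamma
        sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
        self.b, self.alpha, self.Xs = sol[0], sol[1:], Xs

    def predict(self, x):
        if self.alpha is None:  # untrained model: Q-values start at zero
            return 0.0
        k = rbf_kernel(np.atleast_2d(np.asarray(x, dtype=float)), self.Xs, self.sigma)
        return float(k[0] @ self.alpha + self.b)


def q_update(model, s, a, r, s_next, actions, lr=0.5, discount=0.95):
    """One Q-learning step: build the target, store it as a sample, retrain on the window."""
    q_sa = model.predict(np.append(s, a))
    q_next = max(model.predict(np.append(s_next, a2)) for a2 in actions)
    target = q_sa + lr * (r + discount * q_next - q_sa)
    model.add(np.append(s, a), target)
    model.fit()
    return target
```

Refitting after every step costs on the order of the cube of the window length, so the bounded window is what keeps the per-step cost fixed in this sketch; an incremental solver would be a natural refinement but is outside what the abstract describes.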
