Q-Learning Algorithm Based on Predictive State Representations

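Cite this article:	LIU Yunlong, LI Renhou, LIU Jianshu. Q-Learning Algorithm Based on Predictive State Representations[J]. Journal of Xi'an Jiaotong University, 2008, 42(12).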

Authors:	LIU Yunlong (刘云龙), LI Renhou (李人厚), LIU Jianshu (刘建书)

Affiliation:	Institute of Systems Engineering, Xi'an Jiaotong University, Xi'an 710049

Funding:	National "211 Project" construction program; Action Plan for Invigorating Education Toward the 21st Century ("985 Program")

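Abstract:	A Q-learning algorithm based on predictive state representations (PSRs) is proposed for planning in uncertain environments. The algorithm combines the PSR method with Q-learning by using the PSR prediction vector as the state representation for Q-learning, so that the resulting states have the Markov property and satisfy the requirements of reinforcement learning tasks. Q-learning is then used to learn the agent's optimal policy, which solves the planning problem under uncertainty. Simulation results show that, when finding a near-optimal policy for the agent, the algorithm needs roughly the same number of learning episodes as are needed when the environment state is assumed to be known.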
Keywords:	planning under uncertainty; predictive state representation; Q-learning algorithm; cheese maze
Q-Learning Algorithm Based on Predictive State Representations

LIU Yunlong, LI Renhou, LIU Jianshu. Q-Learning Algorithm Based on Predictive State Representations[J]. Journal of Xi'an Jiaotong University, 2008, 42(12).

Authors:	LIU Yunlong, LI Renhou, LIU Jianshu

Abstract:	A Q-learning algorithm based on predictive state representations is proposed for solving the problem of planning under uncertainty. Predictive state representations are combined with the Q-learning algorithm: the prediction vector of a predictive state representation is used as the state representation for Q-learning, so that the resulting states have the Markov property and satisfy the requirements of reinforcement learning tasks. The Q-learning algorithm is then used to find the optimal policy and...

Keywords:	planning under uncertainty; predictive state representation; Q-learning algorithm; cheese maze
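
The idea summarized in the abstract, using a PSR prediction vector as the state for Q-learning, can be illustrated with a rough sketch. The abstract does not give the update equations, so everything below is an assumption: the sketch uses the standard linear-PSR update p(Q|hao) = p(Q|h)^T M_ao / p(Q|h)^T m_ao with model parameters taken as already known, rounds the prediction vector so that it can index a tabular Q-function, and applies the usual one-step Q-learning backup. The names `psr_update`, `state_key`, and `q_update`, and all numeric values, are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def psr_update(p, m_ao, M_ao):
    """Update a linear-PSR prediction vector after action a and observation o.

    p    : predictions of the n core tests given the current history
    m_ao : weight vector of the one-step test (a, o), length n
    M_ao : n x n matrix; column j is the weight vector of the test
           "a, o, followed by core test j"
    Returns p' with p'[j] = (p . M_ao[:, j]) / (p . m_ao).
    """
    n = len(p)
    denom = sum(p[i] * m_ao[i] for i in range(n))
    if denom <= 0.0:
        raise ValueError("observation has zero predicted probability")
    return [sum(p[i] * M_ao[i][j] for i in range(n)) / denom for j in range(n)]


def state_key(p, decimals=3):
    """Round the prediction vector so it can index a tabular Q-function.

    Rounding is an assumption of this sketch, not a detail from the paper.
    """
    return tuple(round(x, decimals) for x in p)


def q_update(Q, s, a, r, s_next, n_actions, alpha=0.1, gamma=0.9):
    """One-step Q-learning backup on PSR-derived states."""
    best_next = max(Q[(s_next, b)] for b in range(n_actions))
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


# Made-up parameters for a model with two core tests (illustration only).
p0 = [0.5, 0.5]
m_ao = [1.0, 0.0]
M_ao = [[0.2, 0.8],
        [0.0, 0.0]]

Q = defaultdict(float)            # Q[(state_key, action)] -> estimated value
p1 = psr_update(p0, m_ao, M_ao)   # prediction vector after taking a, seeing o
q_update(Q, state_key(p0), 0, 1.0, state_key(p1), n_actions=2)
```

In a full agent this pair of updates would run once per time step, with the action chosen by some exploration rule (for example epsilon-greedy) over `Q`; how the original algorithm handles exploration and the continuous prediction vector is not stated in the abstract.
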
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
