A Belief Selection Method in POMDP Point-Based Value Iteration Algorithm

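Cite this article:	FENG Qi, ZHOU Xuezhong, HUANG Houkuan, ZHANG Xiaoping. A Belief Selection Method in POMDP Point-Based Value Iteration Algorithm[J]. Journal of Beijing Jiaotong University (Natural Science Edition), 2009, 33(5).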

Authors:	FENG Qi, ZHOU Xuezhong, HUANG Houkuan, ZHANG Xiaoping

Affiliation:	School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China (all four authors)

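Funding:	National Natural Science Foundation of China; National "973" Program; Major Program of the Beijing Municipal Science and Technology Commission; National Science and Technology Support Program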

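Abstract:	The partially observable Markov decision process (POMDP) is a mathematical model for decision making in uncertain environments. Point-based value iteration algorithms are a class of approximate methods for solving POMDP problems. To address belief selection, a key issue in point-based algorithms, this paper proposes an entropy-based belief selection method (EBBS). EBBS computes the uncertainty of the belief points that can be reached by transition, and expands the belief set with belief points whose entropy is lower and whose distance to the current belief set exceeds a given threshold. Experimental results show that value iteration with entropy-selected belief points needs to perform value iteration on only a relatively small number of belief points to obtain the expected discounted reward.
Keywords:	POMDP; value iteration; point-based algorithms; belief selection; uncertainty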
A Belief Selection Method in POMDP Point-Based Value Iteration Algorithm

FENG Qi, ZHOU Xuezhong, HUANG Houkuan, ZHANG Xiaoping. A Belief Selection Method in POMDP Point-Based Value Iteration Algorithm[J]. Journal of Beijing Jiaotong University, 2009, 33(5).

Authors:	FENG Qi, ZHOU Xuezhong, HUANG Houkuan, ZHANG Xiaoping

Institution:	FENG Qi, ZHOU Xuezhong, HUANG Houkuan, ZHANG Xiaoping (School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China)

Abstract:	The Partially Observable Markov Decision Process (POMDP) provides a mathematical model for decision making under uncertainty. Point-based value iteration algorithms are effective approximate algorithms for solving POMDP problems. In this paper we propose a belief selection method, Entropy-Based Belief Selection (EBBS), which uses the entropy of belief points to address the crucial issue of belief selection in point-based algorithms. The EBBS algorithm first sorts the belief points by entropy and then selects belief that has lower entropy and whose dist...

Keywords:	POMDP
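
The selection rule summarized in the abstract (sort candidate beliefs by entropy, then keep a low-entropy one that is far enough from the current belief set) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the L1 distance, the choice of adding at most one successor per existing belief, the function names, and the toy two-state model are all assumptions made for illustration.

```python
import math


def entropy(belief):
    """Shannon entropy (in nats) of a discrete belief distribution."""
    return -sum(p * math.log(p) for p in belief if p > 0.0)


def l1_distance_to_set(belief, belief_set):
    """Smallest L1 distance from `belief` to any point in `belief_set` (assumed metric)."""
    return min(sum(abs(p - q) for p, q in zip(belief, b)) for b in belief_set)


def belief_update(belief, action, obs, T, O):
    """Bayes update of a belief after taking `action` and observing `obs`.

    T[a][s][s2] = P(s2 | s, a) and O[a][s2][o] = P(o | s2, a).
    Returns None when the observation has zero probability.
    """
    n = len(belief)
    unnorm = [
        O[action][s2][obs] * sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnorm)
    if total <= 0.0:
        return None
    return [u / total for u in unnorm]


def expand_belief_set(belief_set, T, O, threshold):
    """One hypothetical expansion step in the spirit of EBBS.

    For each belief, enumerate its one-step successors, sort them by entropy,
    and add the lowest-entropy successor whose L1 distance to the set built
    so far is greater than `threshold`.
    """
    new_set = list(belief_set)
    n_actions = len(T)
    n_obs = len(O[0][0])
    for b in belief_set:
        successors = []
        for a in range(n_actions):
            for o in range(n_obs):
                b2 = belief_update(b, a, o, T, O)
                if b2 is not None:
                    successors.append(b2)
        successors.sort(key=entropy)  # least uncertain candidates first
        for cand in successors:
            if l1_distance_to_set(cand, new_set) > threshold:
                new_set.append(cand)
                break
    return new_set


# Toy model (assumed): action 0 observes the state noisily, action 1 resets it.
T = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
O = [
    [[0.85, 0.15], [0.15, 0.85]],
    [[0.5, 0.5], [0.5, 0.5]],
]
initial = [[0.5, 0.5]]
expanded = expand_belief_set(initial, T, O, threshold=0.1)
unchanged = expand_belief_set(initial, T, O, threshold=2.0)
```

With the small threshold, the uniform starting belief gains one sharper successor produced by the informative action; with a threshold equal to the largest possible L1 distance between two distributions, no candidate qualifies and the set stays as it was. Point-based value iteration would then run its backups only on the beliefs in the expanded set.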
This article is indexed in CNKI, Wanfang Data, and other databases.