
Continuous Time MDP with Discounted Reward Criterion: the Case of Q(f)-Processes Being Not Unique
Original title (Chinese): Q(f)-过程非唯一时连续时间折扣目标MDP
Cite this article: Guo Xianping. Continuous Time MDP with Discounted Reward Criterion: the Case of Q(f)-Processes Being Not Unique[J]. Journal of Natural Science of Hunan Normal University, 1996(3).
Author: Guo Xianping (郭先平)
Affiliation: Department of Mathematics, Hunan Normal University
Funding: National Natural Science Foundation of China; Natural Science Foundation of Hunan Province
Abstract: We consider the discounted model of continuous-time MDPs with a countable state space. Unlike previous work, we drop the traditional assumption that the Q(f)-process determined by a policy f is unique, and consider for the first time the case in which the Q(f)-process is not unique. Using the construction theory of Q-processes and methods of topological analysis, we prove the existence of optimal policies.

Keywords: continuous time MDP  non-uniqueness  discounted reward criterion  optimal policies
This article is indexed in CNKI and other databases.