Continuous Time MDP with Discounted Reward Criterion-the Case of Q(f)-Processes being not Unique


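Cite this article:	Guo Xianping. Continuous Time MDP with Discounted Reward Criterion-the Case of Q(f)-Processes being not Unique [J]. Journal of Natural Science of Hunan Normal University, 1996(3).

Author:	Guo Xianping

Affiliation:	Department of Mathematics, Hunan Normal University

Funding:	National Natural Science Foundation of China; Natural Science Foundation of Hunan Province

Abstract:	We consider the discounted model of continuous-time MDPs with a countable state space. Unlike previous work, we drop the traditional assumption that the Q(f)-process determined by a policy f is unique and, for the first time, consider the case in which the Q(f)-process is not unique. Using the construction theory of Q-processes and methods of topological analysis, we prove the existence of an optimal policy.
Keywords:	continuous-time MDP; non-uniqueness; discounted criterion; optimal policy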
Continuous Time MDP with Discounted Reward Criterion-the Case of Q(f)-Processes being not Unique

Guo Xianping. Continuous Time MDP with Discounted Reward Criterion-the Case of Q(f)-Processes being not Unique [J]. Journal of Natural Science of Hunan Normal University, 1996(3).

Authors:	Guo Xianping

Abstract:	We consider the discounted model of continuous-time MDPs with a countable state space. Unlike earlier work, this paper drops the condition that the Q(f)-process determined by a policy f is unique and, for the first time, examines the case in which Q(f)-processes are not unique. Using the construction theory of Q-processes together with topological and analytic methods, we prove the existence of optimal policies.

Keywords:	continuous-time MDP; non-uniqueness; discounted reward criterion; optimal policies
This article is indexed in CNKI and other databases.
