
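An Algorithm for Discounted-Model Markov Decision Programming
Cite this article: Liu Difen. An Algorithm for Discounted-Model Markov Decision Programming[J]. Journal of Natural Science of Hunan Normal University, 1987(4).
Author: Liu Difen
Affiliation: Department of Mathematics, Hunan Normal University
Abstract: For discounted-model Markov decision programming with finite state and decision sets, this paper constructs a new operator and derives from it a block successive approximation algorithm for computing the optimal reward vector. The algorithm is superior to the usual standard approximation algorithm while keeping the same range of applicability. Combining it with the reward revision method yields a reward revision block successive approximation algorithm, which greatly reduces the computational workload.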

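Keywords: Markov decision  programming theory  successive approximation method/discounted model  decision function  contraction operator  reward vector  nearly decomposable matrix  reward revision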

A NUMERICAL PROCEDURE FOR DISCOUNTED MODEL MARKOV DECISION PROGRAMMING
Liu Difen. A NUMERICAL PROCEDURE FOR DISCOUNTED MODEL MARKOV DECISION PROGRAMMING[J]. Journal of Natural Science of Hunan Normal University, 1987(4).
Authors: Liu Difen
Institution: Department of Mathematics, Hunan Normal University
Abstract: In this paper we present a new operator that leads to a block successive approximation procedure for computing the optimal total expected discounted reward vector in finite state and action, discrete time Markov decision programming. This numerical procedure is superior to the standard method and applies in the same settings as the standard method. Building on this procedure and adding reward revision, we obtain a reward revision block successive approximation method, which reduces the number of operations without complicating them.
Keywords: Markovian decision  programming theory  successive approximation/discounted model  action function  contraction operator  reward vector  nearly decomposable matrix  reward revision
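For orientation, the standard successive approximation scheme that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration and not the paper's algorithm: the abstract does not state the new operator, so `blockwise_approximation` is only a generic block sweep chosen for illustration (an assumption), and the names `P`, `r`, `beta`, `blocks`, and the example data are hypothetical.

```python
# P[a][s][t]: probability of moving from state s to state t under action a.
# r[a][s]:    one-step reward for taking action a in state s.
# beta:       discount factor, 0 <= beta < 1.

def backup(P, r, beta, v, s):
    """Best one-step discounted value of state s given the estimate v."""
    return max(
        r[a][s] + beta * sum(P[a][s][t] * v[t] for t in range(len(v)))
        for a in range(len(P))
    )

def successive_approximation(P, r, beta, tol=1e-10, max_iter=100_000):
    """Standard scheme: every state is updated from the previous iterate."""
    n = len(r[0])
    v = [0.0] * n
    for _ in range(max_iter):
        new_v = [backup(P, r, beta, v, s) for s in range(n)]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v

def blockwise_approximation(P, r, beta, blocks, tol=1e-10, max_iter=100_000):
    """Illustrative block sweep (an assumption, NOT the paper's operator):
    states are updated one block at a time, and later blocks already use
    the refreshed values of earlier blocks."""
    n = len(r[0])
    v = [0.0] * n
    for _ in range(max_iter):
        old = v[:]
        for block in blocks:
            # Compute the whole block from the current v, then write it back.
            updates = {s: backup(P, r, beta, v, s) for s in block}
            for s, value in updates.items():
                v[s] = value
        if max(abs(x - y) for x, y in zip(v, old)) < tol:
            return v
    return v

# Hypothetical 3-state, 2-action example.
P = [
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]],
    [[1.0, 0.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]],
]
r = [[1.0, 2.0, 0.0], [0.5, 1.0, 3.0]]
beta = 0.9

v_standard = successive_approximation(P, r, beta)
v_block = blockwise_approximation(P, r, beta, blocks=[[0, 1], [2]])
```

Both routines iterate toward the same optimal reward vector. The sketch shows only the structural difference between updating all states at once and updating them block by block, and it makes no claim about the relative efficiency reported in the paper.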
This article is indexed in CNKI and other databases.