
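Optimization Algorithms for Semi-Markov Control Processes With Average Criteria
Cite this article:DAI Gui-ping,YIN Bao-qun,LI Yan-jie,ZHOU Ya-ping,XI Hong-sheng.Optimization Algorithms for Semi-Markov Control Processes With Average Criteria[J].Journal of University of Science and Technology of China,2005,35(2):202-207.
Authors:DAI Gui-ping  YIN Bao-qun  LI Yan-jie  ZHOU Ya-ping  XI Hong-sheng
Affiliations:1. Department of Automation, University of Science and Technology of China, Hefei, Anhui 230027
2. Department of Management Science, University of Science and Technology of China, Hefei, Anhui 230027
Funding:Supported by the National Natural Science Foundation of China (60274012) and the Natural Science Foundation of Anhui Province (01042308).
Abstract:Performance optimization algorithms are studied for a class of semi-Markov control processes (SMCPs) with compact action sets under the infinite-horizon average-cost criterion. Using the equivalent Markov process method, the performance potential formula and the average-cost optimality equation for SMCPs are derived. A policy iteration algorithm and a value iteration algorithm for finding optimal or suboptimal stationary policies are given, and their convergence is proved. Finally, a numerical example illustrates the application of the algorithms.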

Keywords:semi-Markov control processes  compact action set  performance potentials  policy iteration  value iteration
Article ID:0253-2778(2005)02-0202-06
Revised:16 December 2003

Optimization Algorithms for Semi-Markov Control Processes With Average Criteria
DAI Gui-ping,YIN Bao-qun,LI Yan-jie,ZHOU Ya-ping,XI Hong-sheng.Optimization Algorithms for Semi-Markov Control Processes With Average Criteria[J].Journal of University of Science and Technology of China,2005,35(2):202-207.
Authors:DAI Gui-ping  YIN Bao-qun  LI Yan-jie  ZHOU Ya-ping  XI Hong-sheng
Institution:DAI Gui-ping 1,YIN Bao-qun 1,LI Yan-jie 1,ZHOU Ya-ping 2,XI Hong-sheng 1
Abstract:Optimization algorithms are studied for a class of semi-Markov control processes (SMCPs) with compact action sets under the infinite-horizon average-cost criterion. Using the equivalent Markov process method, formulas for the performance potentials and the average-cost optimality equation for SMCPs are derived. A policy iteration algorithm and a value iteration algorithm are proposed, each of which yields an optimal or suboptimal stationary policy in a finite number of iterations. The convergence of these algorithms is established without assuming that the corresponding iteration operator is an sp-contraction. A numerical example is provided to illustrate the application of the algorithms.
Keywords:semi-Markov control processes  compact action set  performance potentials  policy iteration  value iteration
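
To make the policy iteration idea in the abstract concrete, here is a minimal Python sketch of textbook average-cost policy iteration for a semi-Markov decision process. It is not the paper's algorithm: it assumes a finite state space, a finite action set (the paper treats compact action sets), and a unichain embedded Markov chain under every policy. The names `solve`, `evaluate`, `policy_iteration`, the array layout `P[a][i][j]`, `c[i][a]`, `tau[i][a]`, and the two-state numbers are all hypothetical, chosen only for illustration. Evaluation solves h = c - eta*tau + P h with h[0] = 0, where h plays the role of the performance potentials and eta is the average cost per unit time; improvement then picks, in each state, an action minimizing c - eta*tau + P h.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for col in range(k, n + 1):
                M[r][col] -= f * M[k][col]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(M[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x


def evaluate(P, c, tau):
    """Return (eta, h) solving h = c - eta*tau + P h with h[0] = 0."""
    n = len(c)
    # The unknown vector is (eta, h[1], ..., h[n-1]); h[0] is pinned to 0.
    A = [[tau[i]] + [(1.0 if i == j else 0.0) - P[i][j] for j in range(1, n)]
         for i in range(n)]
    x = solve(A, c)
    return x[0], [0.0] + x[1:]


def policy_iteration(P, c, tau, max_iter=100):
    """P[a][i][j]: embedded transition probabilities, c[i][a]: expected cost
    per sojourn, tau[i][a]: expected sojourn time. Returns (policy, eta, h)."""
    n, m = len(c), len(c[0])
    policy = [0] * n
    for _ in range(max_iter):
        eta, h = evaluate([P[policy[i]][i] for i in range(n)],
                          [c[i][policy[i]] for i in range(n)],
                          [tau[i][policy[i]] for i in range(n)])

        def q(i, a):
            return (c[i][a] - eta * tau[i][a]
                    + sum(P[a][i][j] * h[j] for j in range(n)))

        new_policy = []
        for i in range(n):
            best = min(range(m), key=lambda a: q(i, a))
            # Keep the current action unless another one is strictly better.
            strictly_better = q(i, best) < q(i, policy[i]) - 1e-12
            new_policy.append(best if strictly_better else policy[i])
        if new_policy == policy:
            return policy, eta, h
        policy = new_policy
    return policy, eta, h


# Hypothetical two-state, two-action example (illustrative numbers only).
P = [[[0.5, 0.5], [0.4, 0.6]],   # action 0
     [[0.9, 0.1], [0.2, 0.8]]]   # action 1
c = [[1.0, 2.0], [3.0, 0.2]]
tau = [[1.0, 2.0], [1.5, 0.5]]
policy, eta, h = policy_iteration(P, c, tau)
print(policy, eta, h)
```

Pinning h[0] = 0 removes the additive constant that the evaluation equation leaves undetermined, and the strict-improvement tolerance avoids cycling between actions that tie. With a compact action set, the per-state minimization would become a continuous optimization, which this sketch does not attempt.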
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.