1.
2.
Wei Liren (魏力仁), 《系统科学与复杂性》, 1988, (2)
The total discounted return v((π,δ)_T, β), obtained by replacing the nonstationary v(π,β) with the stationary v(δ,β) from stage T on, is said to be a T-approximation for the nonstationary v(π,β). In this paper, given bounds on the error, we obtain some results on determining both a value T and a T-stationary replacement. Our algorithm can be regarded as an extension of White's (1985) method of reward revision from the stationary version to the nonstationary case.
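As a rough illustration of how an error bound can fix the switching stage T, the sketch below uses a standard truncation argument rather than the paper's own method. It assumes one-stage rewards bounded in absolute value by a hypothetical constant `reward_bound` and a discount factor `beta` in (0, 1). Two policies that agree on stages 0, ..., T-1 then have discounted returns differing by at most 2 * reward_bound * beta^T / (1 - beta), so the smallest T meeting a tolerance `eps` follows directly. The function name and parameters are illustrative and do not come from the abstract.

```python
import math


def min_switch_stage(beta: float, reward_bound: float, eps: float) -> int:
    """Smallest T with 2 * reward_bound * beta**T / (1 - beta) <= eps.

    Assumption (not from the abstract): |one-stage reward| <= reward_bound.
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    if reward_bound <= 0.0 or eps <= 0.0:
        raise ValueError("reward_bound and eps must be positive")

    # Worst-case gap between any two policies (the T = 0 case).
    worst = 2.0 * reward_bound / (1.0 - beta)
    if worst <= eps:
        return 0

    # Solve worst * beta**T <= eps for T, then correct for rounding error.
    T = math.ceil(math.log(eps / worst) / math.log(beta))
    while worst * beta ** T > eps:
        T += 1
    while T > 0 and worst * beta ** (T - 1) <= eps:
        T -= 1
    return T


if __name__ == "__main__":
    # With beta = 0.5 and rewards in [-1, 1], the bound 4 * 0.5**T
    # first drops to 0.5 at T = 3.
    print(min_switch_stage(0.5, 1.0, 0.5))
```

This only yields a conservative T valid for every stationary replacement δ. Choosing δ itself, and any sharper bounds, is the subject of the paper and is not reproduced here.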
3.
4.
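Building on Kurano's work on the periodic stationary average model, this paper discusses the nonstationary MDP average model. The discussion proceeds under the following two assumptions. Assumption A: there exists a constant M_1 such that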