
An improved Ornstein-Uhlenbeck exploration noise based on fractional-order calculus for reinforcement learning environments with momentum
Cite this article: WANG Tao, ZHANG Wei-Hua, PU Yi-Fei. An improved Ornstein-Uhlenbeck exploration noise based on fractional-order calculus for reinforcement learning environments with momentum[J]. Journal of Sichuan University (Natural Science Edition), 2023, 60(2): 022001-69
Authors: WANG Tao  ZHANG Wei-Hua  PU Yi-Fei
Affiliation: College of Computer Science, Sichuan University, Chengdu 610065
Funding: Sichuan Science and Technology Program (2022YFQ0047)
Abstract: This paper extends the integer-order Ornstein-Uhlenbeck (OU) noise model used in the deep deterministic policy gradient (DDPG) algorithm to a fractional-order OU noise model, in which each noise sample depends not only on the noise of the previous step but on the noise generated over the previous K steps. We compared DDPG and twin delayed deep deterministic policy gradient (TD3) using fractional-order OU noise against the original DDPG and TD3 in gym inertial environments. We found that fractional-order OU noise oscillates over a wider range than the original OU noise, and that the algorithms using it explore better and converge faster in inertial environments.

Keywords: DDPG algorithm  TD3 algorithm  fractional-order calculus  OU noise  reinforcement learning
Received: 2022-03-26
Revised: 2022-06-12
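
The abstract describes noise whose next value depends on the previous K samples rather than only the last one. A minimal sketch of that idea follows. It is not the authors' formulation, which the abstract does not give. It assumes a Grünwald-Letnikov discretization of a fractional derivative of order `alpha`, truncated to K terms, and keeps the usual `sqrt(dt)` scaling of the Gaussian term. The class name `FractionalOUNoise`, the helper `gl_weights`, and all default parameter values are hypothetical. With `alpha = 1` the weights collapse to `[1, -1, 0, ...]` and the update reduces to the standard discrete OU recursion.

```python
import numpy as np


def gl_weights(alpha, K):
    """Grünwald-Letnikov coefficients w_0..w_K for a difference of order alpha."""
    w = np.empty(K + 1)
    w[0] = 1.0
    for j in range(1, K + 1):
        # Standard recurrence: w_j = w_{j-1} * (1 - (alpha + 1) / j)
        w[j] = w[j - 1] * (1.0 - (alpha + 1.0) / j)
    return w


class FractionalOUNoise:
    """Illustrative fractional-order OU noise (assumed scheme, not the paper's)."""

    def __init__(self, size, alpha=0.8, K=20, mu=0.0, theta=0.15,
                 sigma=0.2, dt=1e-2, seed=None):
        self.size, self.alpha, self.K = size, alpha, K
        self.mu, self.theta, self.sigma, self.dt = mu, theta, sigma, dt
        self.w = gl_weights(alpha, K)
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        # Store deviations from the mean; row 0 is the most recent sample.
        self.history = np.zeros((self.K, self.size))

    def sample(self):
        y_prev = self.history[0]
        # Memory term over the previous K deviations: -sum_{j=1..K} w_j * y_{t+1-j}
        memory = -np.tensordot(self.w[1:], self.history, axes=(0, 0))
        # Mean-reverting drift, scaled by dt**alpha from the discretization
        drift = -self.theta * y_prev * self.dt ** self.alpha
        # Gaussian term with the usual sqrt(dt) scaling (an assumption here)
        diffusion = self.sigma * np.sqrt(self.dt) * self.rng.standard_normal(self.size)
        y = memory + drift + diffusion
        # Shift the history window and store the newest deviation in row 0
        self.history = np.roll(self.history, 1, axis=0)
        self.history[0] = y
        return self.mu + y
```

In a DDPG- or TD3-style training loop, the value returned by `sample()` would be added to the deterministic action before clipping to the action bounds, and `reset()` would be called at the start of each episode.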
This article is indexed in Wanfang Data and other databases.