
Optimization of the operation of distributed energy system based on deep reinforcement learning
Cite this article: Ruan Yingjun, Hou Zequn, Qian Fanyue, Meng Hua. Optimization of the operation of distributed energy system based on deep reinforcement learning[J]. Science Technology and Engineering (科学技术与工程), 2022, 22(17): 7021-7030.
Authors: Ruan Yingjun  Hou Zequn  Qian Fanyue  Meng Hua
Institution: School of Mechanical Engineering, Tongji University
Funding: National Natural Science Foundation of China (No. 51978482)
Abstract: Distributed energy systems have become an important direction for China's future energy development because they are efficient, environmentally friendly, economical, reliable, and flexible. Many distributed energy systems in China currently perform poorly in economic terms, mainly because they lack a good operation strategy. This paper proposes an operation optimization method for distributed energy systems based on deep reinforcement learning. First, a mathematical model is established for each device in the distributed energy system. Second, the basic principles of reinforcement learning, the way deep learning is combined with reinforcement learning, and the workflow of the Distributed Proximal Policy Optimization (DPPO) algorithm, which is based on the actor-critic algorithm, are described in detail, and the operation optimization problem is formulated as a Markov decision process (MDP). Finally, the agent is trained on historical data; the trained model can optimize the distributed energy system studied in this paper in real time, and its scheduling strategy is compared with those obtained by the Deep Q Network (DQN) algorithm and by LINGO. The results show that the DPPO-based scheduling method reduces operating cost by 7.12% relative to the DQN algorithm and by 2.27% relative to LINGO, achieving economical scheduling of the energy system.

Keywords: Deep reinforcement learning  Distributed Proximal Policy Optimization  Distributed energy system  Operation optimization
Received: 2021/8/16
Revised: 2022/3/5
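
The abstract names the DPPO algorithm and an MDP formulation but gives no formulas, so the following is only a minimal sketch of the standard clipped surrogate objective used by proximal policy optimization, together with one plausible reward for a cost-minimizing scheduler. The function names, the clipping value 0.2, and the reward defined as negative operating cost are assumptions made for illustration and are not taken from the paper.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of PPO (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_eps is an assumed value.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # probability ratio pi_new / pi_old
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def step_reward(gas_cost, grid_cost):
    """Hypothetical MDP reward: the negative operating cost of one time step."""
    return -(gas_cost + grid_cost)


if __name__ == "__main__":
    # A ratio of 2 with a positive advantage is clipped to 1 + clip_eps.
    print(ppo_clip_objective([math.log(2.0)], [0.0], [1.0]))
    print(step_reward(gas_cost=3.0, grid_cost=2.0))
```

In the distributed variant, several workers typically collect trajectories in parallel and their data or gradients update a shared policy with this same objective; how the paper arranges that is not stated in the abstract.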
