
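Deep reinforcement learning-based signal control at critical intersections for large events
Cite this article: Song Tailong, He Yulong, Liu Qin. Deep reinforcement learning-based signal control at critical intersections for large events[J]. 科学技术与工程 (Science Technology and Engineering), 2023, 23(22): 9694-9701.
Authors: Song Tailong, He Yulong, Liu Qin
Affiliations: Beijing University of Technology; Beijing Key Laboratory of Traffic Engineering, Beijing University of Technology
Funding: National Key Research and Development Program of China (2017YFC0803903)
Abstract: During a large event, traffic pressure on the road network around the venue differs from everyday operating conditions. The normal operation of key intersections around the venue is one of the important factors in ensuring that the event runs smoothly, so dynamic control should be adopted to improve the throughput of key intersections and meet the travel demand of event participants. To this end, this paper builds on the A2C (Advantage Actor Critic) reinforcement learning algorithm and accounts for the fact that, in the context of large events, travelers are numerous and mostly use public transport. In constructing the reward function, vehicle queuing time is subdivided into the vehicle waiting times of the travelers' different travel modes, and parameters are introduced to modify how the reward is calculated for different vehicle types, so that the agent prioritizes the travel demand of event participants while optimizing signal timing. Finally, taking a large intersection near the Beijing Capital Stadium (首都体育馆) as an example, simulation experiments are run with the traffic flow simulation software SUMO. The results show that the A2C signal control method with the modified reward function structure outperforms both fixed-time signal control and a control method based on the DQN (Deep Q-Network) algorithm, improving the efficiency of public transport and of overall traffic flow at the intersection.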

Keywords: large events; key intersections; deep reinforcement learning; signal control
Received: 2022/7/4
Revised: 2023/5/12
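
The reward design summarized in the abstract (waiting time split by travel mode, with parameters that change how much each vehicle type counts) can be illustrated with a minimal sketch. The abstract does not give the paper's actual formula or parameter values, so the weights, the class names `"bus"` and `"car"`, and the function `weighted_waiting_reward` below are all hypothetical placeholders, not the authors' implementation.

```python
from dataclasses import dataclass

# Hypothetical per-class weights: a larger weight makes waiting by that
# vehicle class cost the agent more. These values are assumptions; the
# paper's actual parameters are not stated in the abstract.
ASSUMED_WEIGHTS = {"bus": 3.0, "car": 1.0}


@dataclass
class VehicleWait:
    vehicle_class: str   # e.g. "bus" or "car" (illustrative labels)
    waiting_time: float  # accumulated waiting time in seconds


def weighted_waiting_reward(vehicles, weights=ASSUMED_WEIGHTS, default_weight=1.0):
    """Return the negative weighted sum of waiting times.

    Less weighted waiting gives a higher (less negative) reward, so an
    agent maximizing this value is pushed to serve heavily weighted
    classes, such as public transport, first.
    """
    total = 0.0
    for v in vehicles:
        total += weights.get(v.vehicle_class, default_weight) * v.waiting_time
    return -total


# Usage: one bus waiting 10 s and one car waiting 5 s.
sample = [VehicleWait("bus", 10.0), VehicleWait("car", 5.0)]
print(weighted_waiting_reward(sample))
```

In a SUMO experiment, the per-vehicle inputs could plausibly be read through TraCI (for example `traci.vehicle.getAccumulatedWaitingTime` and `traci.vehicle.getVehicleClass`), though the abstract does not say how the authors collected them.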

Deep reinforcement learning-based signal control at critical intersections for large events
Song Tailong, He Yulong, Liu Qin. Deep reinforcement learning-based signal control at critical intersections for large events[J]. Science Technology and Engineering, 2023, 23(22): 9694-9701.
Authors: Song Tailong, He Yulong, Liu Qin
Institution: Beijing University of Technology
Abstract: The normal operation of key intersections around an event venue is one of the most important factors in ensuring that a large event runs smoothly. Dynamic control should be adopted to improve the efficiency of key intersections and meet the traffic demand of event participants. Building on the A2C (Advantage Actor Critic) reinforcement learning algorithm, this work considers that large events involve many travelers, most of whom use public transport, and, in constructing the reward function, subdivides vehicle waiting time into the waiting times of vehicles for each travel mode. The agent thereby gives priority to the travel demand of event participants during signal timing optimization. Finally, simulation experiments were conducted with the SUMO traffic flow simulation software at a large intersection near the Beijing Capital Stadium. The results indicate that the A2C signal control method with the modified reward function structure outperforms both fixed-time signal control and a control method based on the DQN (Deep Q-Network) algorithm, improving the efficiency of public transport and of overall traffic flow at the intersection.
Keywords: large events; key intersections; deep reinforcement learning; signal control
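
For readers unfamiliar with A2C, the "advantage" in its name is commonly estimated with a one-step temporal-difference target: the critic's value estimate is subtracted from the observed reward plus the discounted value of the next state. The sketch below shows that standard estimator only. The abstract does not specify which advantage estimator, discount factor, or network architecture the authors used, so `one_step_advantage` and its default `gamma` are assumptions for illustration.

```python
def one_step_advantage(reward, value_s, value_next, gamma=0.99, done=False):
    """One-step TD advantage: A = r + gamma * V(s') - V(s).

    reward     : reward received after acting in state s
    value_s    : critic's estimate V(s)
    value_next : critic's estimate V(s')
    gamma      : discount factor (0.99 is an assumed default)
    done       : if True, the episode ended and V(s') is not bootstrapped
    """
    bootstrap = 0.0 if done else gamma * value_next
    return reward + bootstrap - value_s


# Usage: a positive advantage means the chosen signal phase did better
# than the critic expected, so the actor raises its probability.
print(one_step_advantage(-35.0, value_s=-100.0, value_next=-80.0, gamma=0.5))
```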
