Network routing optimization approach based on deep reinforcement learning
Cite this article: Lingyu MENG, Bingli GUO, Wen YANG, Xinwei ZHANG, Zuoqing ZHAO, Shanguo HUANG. Network routing optimization approach based on deep reinforcement learning[J]. System Engineering and Electronics, 2022, 44(7): 2311-2318.
Authors: Lingyu MENG  Bingli GUO  Wen YANG  Xinwei ZHANG  Zuoqing ZHAO  Shanguo HUANG
Institution: 1. School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China; 2. State Key Laboratory of Information Photonics and Optical Communication, Beijing 100876, China
Funding: National Natural Science Foundation of China (61771074); National Key Research and Development Program of China (2018YFB1801702)
Abstract: To address routing optimization under different network loads on the same network topology, two optimization methods based on deep reinforcement learning are proposed that assign routes according to the current network traffic state. Through iterative interaction between a network simulation system and the deep reinforcement learning model, network routing is continuously trained and optimized for the distribution of traffic relationships. The deep deterministic policy gradient (DDPG) algorithm is improved to make it better suited to network routing optimization. In addition, a new link weight construction strategy is designed that uses network traffic to construct the input state elements of the neural network; this preprocessing of the raw data improves the learning efficiency of the neural network and greatly improves the stability of model training. For high-dimensional, large-scale networks, the continuous action space is discretized, which effectively reduces the complexity of the action space and speeds up model convergence. Experimental results show that the proposed methods adapt to changing traffic and link states, enhance the stability of model training, and improve network performance.

Keywords: deep reinforcement learning  routing optimization  deep deterministic policy gradient (DDPG) algorithm
Received: 2021-06-29
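
The abstract does not spell out how the continuous action space is discretized or how link weights are turned into routes, so the following is only a minimal sketch under stated assumptions: the actor outputs one value in [0, 1] per link, each value is snapped to one of a few evenly spaced integer weights, and routes are then computed as weighted shortest paths. The function names `discretize` and `shortest_path`, the parameter `levels`, and the shortest-path routing step are all hypothetical illustrations, not the authors' implementation.

```python
import heapq


def discretize(action, levels):
    """Snap each continuous value in [0, 1] to an integer weight in 1..levels.

    Assumption: evenly spaced levels; the paper's actual scheme is not given
    in the abstract.
    """
    weights = []
    for a in action:
        a = min(max(a, 0.0), 1.0)  # clip out-of-range actor outputs
        weights.append(1 + round(a * (levels - 1)))
    return weights


def shortest_path(links, weights, src, dst):
    """Dijkstra over undirected links; returns the node list or None."""
    graph = {}
    for (u, v), w in zip(links, weights):
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]
```

With `levels` weight values per link, the agent chooses among a finite set of weight vectors instead of a continuous range, which is the kind of reduction in action-space complexity the abstract describes. For example, on a three-node triangle a high weight on the direct link steers traffic onto the two-hop detour.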