
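Computation Offloading Strategy Based on Stackelberg Game and Deep Reinforcement Learning
Cite this article:Xianwei Zhou, Qixu Gong, Songsen Yu. Computation Offloading Strategy Based on Stackelberg Game and Deep Reinforcement Learning[J]. Journal of System Simulation, 2023, 35(2): 372-385. DOI: 10.16182/j.issn1004731x.joss.21-1118
Authors:Xianwei Zhou  Qixu Gong  Songsen Yu
Affiliation:School of Software, South China Normal University, Foshan, Guangdong 528225, China
Funding:Guangdong Province Major Special Project for Applied Science and Technology R&D (2016B020244003); Guangdong Basic and Applied Basic Research Foundation (2020B1515120089); Guangdong Province Enterprise Science and Technology Commissioner Project (GDKTP2020014000)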
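Abstract:To enable the two types of users in a 5G hybrid private network architecture to obtain optimal computation offloading strategies, the competition between the two user types for mobile edge computing (MEC) server resources is modeled as a Stackelberg game, and the strategies under the complete-information game and the incomplete-information game are discussed separately. Under complete information, a unique Nash equilibrium solution exists. Under incomplete information, the environment is modeled as a partially observable Markov decision process (POMDP), and an optimal offloading strategy based on two-stage deep reinforcement learning (TSDRL) is proposed. Simulation experiments show that, compared with the D-DRL algorithm, the proposed algorithm reduces delay by 20.81% and energy consumption by 3.38%, effectively improving user quality of experience (QoE).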

Keywords:5G hybrid private network  computation offloading  Stackelberg game  Nash equilibrium  Markov decision process
Received:2021-11-02

Computation Offloading Strategy Based on Stackelberg Game and DRL
Xianwei Zhou,Qixu Gong,Songsen Yu. Computation Offloading Strategy Based on Stackelberg Game and DRL[J]. Journal of System Simulation, 2023, 35(2): 372-385. DOI: 10.16182/j.issn1004731x.joss.21-1118
Authors:Xianwei Zhou  Qixu Gong  Songsen Yu
Affiliation:School of Software, South China Normal University, Foshan 528225, China
Abstract:To obtain optimal computation offloading strategies for the two kinds of MEC users in a 5G hybrid private network, the competition between the two kinds of users for MEC server resources is modeled as a Stackelberg game, and the strategies under the complete-information game and the incomplete-information game are studied respectively. It is proved that a unique Nash equilibrium solution exists in the complete-information scenario. In the incomplete-information scenario, the environment is modeled as a partially observable Markov decision process (POMDP), and a two-stage deep reinforcement learning (TSDRL) algorithm is proposed to obtain the optimal computation offloading strategy. Simulation results show that, compared with the D-DRL algorithm, the proposed algorithm reduces time delay by 20.81% and energy consumption by 3.38%, and can effectively improve user quality of experience (QoE).
Keywords:5G hybrid private network  computation offloading  Stackelberg game theory  Nash equilibrium  partially observable Markov decision process(POMDP)  
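
As a rough illustration of the leader-follower structure named in the abstract, the sketch below solves a toy two-player Stackelberg game by backward induction. It is not the model from the paper: the utility function, the parameters `a` (resource value) and `c` (unit cost), and all function names are assumptions chosen so that the result can be checked against the textbook closed form, where the leader requests (a - c) / 2 and the follower requests (a - c) / 4.

```python
# Toy Stackelberg game (illustrative assumption, not the paper's model).
# Two users request shares of one MEC server. Each user's utility is
#   u_i = x_i * (a - x_leader - x_follower) - c * x_i,
# so the benefit per unit of resource falls as total demand grows.

def follower_best_response(x_leader, a=1.0, c=0.2):
    """Follower's utility-maximizing request, given the leader's request."""
    # Setting d(u_follower)/d(x_follower) = 0 gives (a - c - x_leader) / 2.
    return max(0.0, (a - c - x_leader) / 2.0)

def leader_utility(x_leader, a=1.0, c=0.2):
    """Leader's utility when it anticipates the follower's best response."""
    x_follower = follower_best_response(x_leader, a, c)
    return x_leader * (a - c - x_leader - x_follower)

def solve_stackelberg(a=1.0, c=0.2, steps=10001):
    """Backward induction: grid-search the leader's request over [0, a - c]."""
    grid = [(a - c) * i / (steps - 1) for i in range(steps)]
    x_leader = max(grid, key=lambda x: leader_utility(x, a, c))
    return x_leader, follower_best_response(x_leader, a, c)

if __name__ == "__main__":
    x_leader, x_follower = solve_stackelberg()
    print(f"leader request:   {x_leader:.3f}")
    print(f"follower request: {x_follower:.3f}")
```

The grid search stands in for the leader's optimization only because this toy utility is one-dimensional and concave. The incomplete-information case described in the abstract has no such closed form, which is where a learned policy such as TSDRL comes in.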