Computation Offloading Strategy Based on Stackelberg Game and DRL

Cite this article:	Xianwei Zhou, Qixu Gong, Songsen Yu. Computation Offloading Strategy Based on Stackelberg Game and DRL[J]. Journal of System Simulation, 2023, 35(2): 372-385. DOI: 10.16182/j.issn1004731x.joss.21-1118

Authors:	Xianwei Zhou, Qixu Gong, Songsen Yu

Affiliation:	School of Software, South China Normal University, Foshan 528225, Guangdong, China

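Funding:	Guangdong Provincial Major Special Project for Applied Science and Technology R&D (2016B020244003); Guangdong Basic and Applied Basic Research Foundation (2020B1515120089); Guangdong Provincial Enterprise Science and Technology Commissioner Project (GDKTP2020014000)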

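Abstract:	To enable the two types of users in a 5G hybrid private network to obtain optimal computation offloading strategies, the competition between the two types of users for mobile edge computing (MEC) server resources is modeled as a Stackelberg game, and the strategies under the complete-information game and the incomplete-information game are discussed separately. Under the complete-information game, a unique Nash equilibrium solution exists. Under the incomplete-information game, the environment is modeled as a partially observable Markov decision process (POMDP), and an optimal offloading strategy based on two-stage deep reinforcement learning (TSDRL) is proposed. Simulation experiments show that, compared with the D-DRL algorithm, the proposed algorithm reduces delay by 20.81% and energy consumption by 3.38%, effectively improving user quality of experience (QoE).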
Keywords:	5G hybrid private network; computation offloading; Stackelberg game; Nash equilibrium; Markov decision process
Received:	2021-11-02
Computation Offloading Strategy Based on Stackelberg Game and DRL

Xianwei Zhou,Qixu Gong,Songsen Yu. Computation Offloading Strategy Based on Stackelberg Game and DRL[J]. Journal of System Simulation, 2023, 35(2): 372-385. DOI: 10.16182/j.issn1004731x.joss.21-1118

Authors:	Xianwei Zhou, Qixu Gong, Songsen Yu

Affiliation:	School of Software, South China Normal University, Foshan 528225, China

Abstract:	To obtain the optimal computation offloading strategy for the two kinds of MEC users in a 5G hybrid private network, the competition between the two kinds of users for MEC server resources is modeled as a Stackelberg game, and the strategies under the complete information game and the incomplete information game are studied separately. It is proved that a unique Nash equilibrium solution exists in the complete information scenario. In the incomplete information scenario, the environment is modeled as a POMDP, and a two-stage deep reinforcement learning (TSDRL) algorithm is proposed to obtain the optimal computation offloading strategy. Simulation results show that, compared with the D-DRL algorithm, the proposed algorithm reduces time delay by 20.81% and energy consumption by 3.38%, and effectively improves user quality of experience (QoE).

Keywords:	5G hybrid private network; computation offloading; Stackelberg game theory; Nash equilibrium; partially observable Markov decision process (POMDP)
