Spectrum allocation algorithm based on multi-agent reinforcement learning in smart grid

doi:10.11959/j.issn.1000-436x.2023179

Journal on Communications, 2023, Vol. 44, Issue 9: 12-24. doi: 10.11959/j.issn.1000-436x.2023179

• Academic Paper •

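Feng YAN¹, Xiaowei LIN², Zhenghao LI³, Xia XU⁴, Weiwei XIA¹, Lianfeng SHEN¹

¹ National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China
² School of Software, Southeast University, Nanjing 211100, China
³ State Grid Shandong Information and Telecommunication Company, Jinan 250001, China
⁴ State Grid Jinan Power Supply Company, Jinan 250012, China

Revised: 2023-09-04 Online: 2023-09-01 Published: 2023-09-01
About the authors: Feng YAN (1983- ), male, born in Tianmen, Hubei, Ph.D., associate professor at Southeast University. His main research interests include UAV ad hoc networks, satellite Internet, and wireless sensor networks.
Xiaowei LIN (1999- ), female, born in Guilin, Guangxi, master's student at Southeast University. Her main research interests include wireless network resource management and applications of reinforcement learning.
Zhenghao LI (1991- ), male, born in Jinan, Shandong, engineer at State Grid Shandong Information and Telecommunication Company. His main research interests include cloud computing, 5G communications, and digitalization.
Xia XU (1986- ), female, born in Chengwu, Shandong, senior engineer at State Grid Jinan Power Supply Company. Her main research interests include smart Internet of things for power systems and network resource optimization.
Weiwei XIA (1975- ), female, born in Jurong, Jiangsu, Ph.D., associate research fellow at Southeast University. Her main research interests include wireless network resource management, edge computing, ubiquitous networks, and short-range wireless communications.
Lianfeng SHEN (1952- ), male, born in Pizhou, Jiangsu, professor and doctoral supervisor at Southeast University. His main research interests include broadband mobile communications, short-range wireless communications, and ubiquitous networks.
Supported by:
The Science and Technology Project of State Grid Corporation of China (520601220022)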

Abstract:

To meet the service requirements of diverse power terminals carried over 5G networks in the smart grid, a spectrum allocation algorithm based on multi-agent reinforcement learning was proposed. Firstly, for the integrated access and backhaul system deployed in the smart grid, and considering the different communication requirements of lightweight and non-lightweight terminal services, the spectrum allocation problem was formulated as a non-convex mixed-integer program that maximizes the overall energy efficiency of the system. Secondly, the problem was modeled as a partially observable Markov decision process and transformed into a fully cooperative multi-agent problem, and a spectrum allocation algorithm based on multi-agent proximal policy optimization was proposed under the framework of centralized training and distributed execution. Finally, the performance of the proposed algorithm was verified by simulation. The results show that the proposed algorithm converges faster and, by effectively reducing intra-layer and inter-layer interference and balancing the access and backhaul link rates, can increase the overall system transmission rate by 25.2%.

Key words: smart grid, integrated access and backhaul, spectrum allocation, multi-agent reinforcement learning
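
The abstract names multi-agent proximal policy optimization (MAPPO) under centralized training and distributed execution, but this page does not give the update rule. As a rough illustration only, the sketch below implements the standard PPO clipped surrogate objective and applies it once per agent with a shared team advantage, which is how a fully cooperative setting is commonly handled. The function names, the clipping value 0.2, and the way the advantage is supplied are assumptions for illustration, not details taken from the paper.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate, averaged over samples (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_eps=0.2 is an assumed value.
    """
    terms = []
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # importance ratio pi_new / pi_old
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the pessimistic (smaller) of the clipped and unclipped terms.
        terms.append(min(ratio * adv, clipped_ratio * adv))
    return sum(terms) / len(terms)


def mappo_actor_objectives(new_logps, old_logps, team_advantages, clip_eps=0.2):
    """One clipped objective per agent (hypothetical helper).

    Each agent's policy sees only its local observation at execution time;
    during training all agents reuse the same team advantage, assumed here to
    come from a centralized critic that is not modeled in this sketch.
    """
    return [
        ppo_clip_objective(lp_new, lp_old, team_advantages, clip_eps)
        for lp_new, lp_old in zip(new_logps, old_logps)
    ]


if __name__ == "__main__":
    # Two agents, three samples each, sharing one team advantage sequence.
    old = [[-1.0, -1.2, -0.8], [-0.9, -1.1, -1.0]]
    new = [[-0.9, -1.3, -0.8], [-0.7, -1.1, -1.2]]
    adv = [0.5, -0.2, 1.0]
    print(mappo_actor_objectives(new, old, adv))
```

Clipping the importance ratio keeps each policy update close to the policy that collected the data, which is the usual motivation given for PPO-style methods in multi-agent settings.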

CLC number: TN92

Feng YAN, Xiaowei LIN, Zhenghao LI, Xia XU, Weiwei XIA, Lianfeng SHEN. Spectrum allocation algorithm based on multi-agent reinforcement learning in smart grid[J]. Journal on Communications, 2023, 44(9): 12-24.

Figures/Tables (11): Fig. 1, Fig. 2, Table 1, Table 2, Fig. 3, Fig. 4, Fig. 5, Fig. 6, Fig. 7, Fig. 8, Fig. 9

Algorithm	Time complexity
Average allocation	O(N(L+H)T_AA)
PSO	O(XNMT_PSO)
MAPPO, MADDPG	O(T_step N)

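Parameter	Value
Total channel bandwidth per base station/MHz	50
Path loss of IAB nodes/dB	28+22lg(d)+20lg(f_c)
Path loss of power terminals/dB	32.4+21lg(d)+20lg(f_c)
Carrier frequency f_c/GHz	2.4
Transmit power of IAB nodes/dBm	33
Transmit power of power terminals/dBm	20
Noise power spectral density/(dBm·Hz^-1)	-174
Self-interference/dB	-70
Subchannel fading model	Rayleigh fading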
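
As a hedged worked example, the parameters in the table above can be combined into a rough single-link budget. The sketch assumes that d is in meters and f_c in GHz (the table does not state the unit of d), reads the terminal path-loss term as 21 lg(d), gives the whole 50 MHz to one link, and ignores interference, self-interference, and Rayleigh fading. The energy-efficiency helper uses rate divided by transmit power, which is a common definition and may differ from the one used in the paper. All function names are hypothetical.

```python
import math

# Values taken from the parameter table.
BANDWIDTH_HZ = 50e6        # total channel bandwidth per base station
CARRIER_GHZ = 2.4          # carrier frequency f_c
NOISE_PSD_DBM_HZ = -174.0  # noise power spectral density
IAB_TX_DBM = 33.0          # IAB node transmit power
TERMINAL_TX_DBM = 20.0     # power terminal transmit power


def path_loss_iab_db(d_m, fc_ghz=CARRIER_GHZ):
    """IAB node path loss: 28 + 22 lg(d) + 20 lg(f_c); d in meters is assumed."""
    return 28.0 + 22.0 * math.log10(d_m) + 20.0 * math.log10(fc_ghz)


def path_loss_terminal_db(d_m, fc_ghz=CARRIER_GHZ):
    """Power terminal path loss: 32.4 + 21 lg(d) + 20 lg(f_c)."""
    return 32.4 + 21.0 * math.log10(d_m) + 20.0 * math.log10(fc_ghz)


def noise_power_dbm(bandwidth_hz):
    """Thermal noise power integrated over the given bandwidth."""
    return NOISE_PSD_DBM_HZ + 10.0 * math.log10(bandwidth_hz)


def shannon_rate_bps(tx_dbm, path_loss_db, bandwidth_hz=BANDWIDTH_HZ):
    """Interference-free Shannon rate of a single link (simplifying assumption)."""
    snr_db = tx_dbm - path_loss_db - noise_power_dbm(bandwidth_hz)
    return bandwidth_hz * math.log2(1.0 + 10.0 ** (snr_db / 10.0))


def energy_efficiency_bit_per_joule(rate_bps, tx_dbm):
    """Rate over transmit power in watts (assumed definition)."""
    tx_watt = 10.0 ** ((tx_dbm - 30.0) / 10.0)
    return rate_bps / tx_watt


if __name__ == "__main__":
    d = 100.0  # example distance in meters, chosen only for illustration
    backhaul = shannon_rate_bps(IAB_TX_DBM, path_loss_iab_db(d))
    access = shannon_rate_bps(TERMINAL_TX_DBM, path_loss_terminal_db(d))
    print(f"IAB link rate: {backhaul / 1e6:.1f} Mbit/s")
    print(f"Terminal link rate: {access / 1e6:.1f} Mbit/s")
    print(f"Terminal energy efficiency: "
          f"{energy_efficiency_bit_per_joule(access, TERMINAL_TX_DBM):.3e} bit/J")
```

Real rates in the simulated system would be lower than these interference-free figures, since the abstract identifies intra-layer and inter-layer interference as the main effects the allocation has to manage.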
