Research on resource allocation algorithm of centralized and distributed Q-learning in machine communication

Telecommunications Science, 2021, Vol. 37, Issue (11): 41-50. doi: 10.11959/j.issn.1000-0801.2021244

Yunhe YU, Jun SUN

College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210023, China

Revised: 2021-10-20 Online: 2021-11-20 Published: 2021-11-01
About the authors: Yunhe YU (born 1995), male, is a master's student at the College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications. His main research interest is resource allocation in massive machine type communication networks.
Jun SUN (born 1980), female, is a master's supervisor at Nanjing University of Posts and Telecommunications. Her main research interest is wireless network resource management.
Supported by:
The National Natural Science Foundation of China (61771255); Open Project of Key Laboratory of Chinese Academy of Sciences (20190904)

Abstract:

For the massive machine type communication (mMTC) scenario, the resource allocation problem is studied with the goal of maximizing system throughput while guaranteeing the quality of service (QoS) requirements of some of the machine type communication devices (MTCDs). Two resource allocation algorithms based on Q-learning are proposed: a centralized Q-learning algorithm (team-Q) and a distributed Q-learning algorithm (dis-Q). First, a clustering algorithm based on cosine similarity (CS) is designed that takes into account the geographic location and multi-level QoS requirements of the MTCDs. In this clustering algorithm, multi-dimensional vectors representing the MTCDs and the data aggregators (DAs) are constructed, and the MTCDs are grouped according to the CS values between these vectors. Then the team-Q and dis-Q learning algorithms are used to allocate resource blocks (RBs) and power to the MTCDs. In terms of throughput, the team-Q and dis-Q algorithms achieve average improvements of 16% and 23% over the dynamic resource allocation algorithm and the greedy algorithm, respectively. In terms of complexity, the dis-Q algorithm needs 25% or less of the complexity of the team-Q algorithm, and its convergence speed is nearly 40% higher.
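The grouping step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the vector layout, so the three-component vectors used here (x position, y position, a scaled QoS level) and the rule "assign each MTCD to the DA with the largest CS value" are assumptions made for the example.

```python
import numpy as np


def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Cosine similarity between every row of a and every row of b.

    Rows must be non-zero vectors, otherwise the normalisation divides by zero.
    """
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


def group_by_cosine_similarity(mtcd_vectors, da_vectors) -> np.ndarray:
    """Return, for each MTCD, the index of the DA with the highest CS value."""
    cs = cosine_similarity_matrix(
        np.asarray(mtcd_vectors, dtype=float),
        np.asarray(da_vectors, dtype=float),
    )
    return cs.argmax(axis=1)


# Hypothetical feature layout: (x in metres, y in metres, scaled QoS level).
da_vectors = [[100.0, 0.0, 10.0], [0.0, 100.0, 50.0]]
mtcd_vectors = [[90.0, 10.0, 12.0], [5.0, 80.0, 40.0], [60.0, 50.0, 10.0]]

groups = group_by_cosine_similarity(mtcd_vectors, da_vectors)
print(groups.tolist())  # [0, 1, 0]
```

Because cosine similarity compares directions rather than magnitudes, the relative scaling of the position and QoS components decides how much each one influences the grouping; the scaling above is arbitrary.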

Key words: resource allocation, centralized Q-learning, distributed Q-learning, cosine similarity, multi-dimensional vector
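To make the Q-learning part of the abstract more concrete, the sketch below shows the standard tabular Q-learning update with an action set made of (resource block, power level) pairs. The state space, reward, learning rate, discount factor and power levels are all placeholders chosen for illustration; the abstract does not specify how team-Q or dis-Q define them.

```python
import itertools

import numpy as np


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    td_target = reward + gamma * q[next_state].max()
    q[state, action] += alpha * (td_target - q[state, action])
    return q[state, action]


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(q.shape[1]))
    return int(q[state].argmax())


# Hypothetical action set: every (resource block index, transmit power in dBm) pair.
num_rbs = 3
power_levels_dbm = [5, 14, 23]
actions = list(itertools.product(range(num_rbs), power_levels_dbm))

num_states = 2  # placeholder state space
q_table = np.zeros((num_states, len(actions)))
rng = np.random.default_rng(0)

# One illustrative learning step with a made-up reward of 1.0.
state = 0
action = epsilon_greedy(q_table, state, epsilon=0.1, rng=rng)
new_value = q_update(q_table, state, action, reward=1.0, next_state=1)
```

In general, a single learner that selects a joint action for N devices needs a table with `len(actions) ** N` columns, whereas N independent learners each keep only `len(actions)` columns. Whether this matches the exact team-Q and dis-Q designs is not stated in the abstract.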

CLC number: TP929.5

Yunhe YU, Jun SUN. Research on resource allocation algorithm of centralized and distributed Q-learning in machine communication[J]. Telecommunications Science, 2021, 37(11): 41-50.

Figures/Tables (6)

Figure 1

Table 1 Simulation parameters

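Parameter	Value
Simulation scenario	Single circular cell
Cell radius	500 m
$p_{MTCD}^{max}$ (maximum MTCD transmit power)	23 dBm
Noise power spectral density	-174 dBm/Hz
Path-loss exponent α	3
Path-loss constant G	$10^{-2}$
Slow-fading model	Log-normal distribution with a standard deviation of 8 dB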
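As a rough illustration of how the simulation parameters combine, the sketch below computes a channel gain, the noise power in one resource block, and a Shannon-type achievable rate. It assumes the common model in which the gain is G · d^(-α) multiplied by a log-normal shadowing term. The resource block bandwidth of 180 kHz and the value 1e-2 for G are assumptions (the page does not state the bandwidth, and the table entry for G is read as 10^-2), and interference is ignored.

```python
import math
import random

P_MAX_DBM = 23.0            # maximum MTCD transmit power (Table 1)
NOISE_PSD_DBM_HZ = -174.0   # noise power spectral density (Table 1)
PATH_LOSS_EXPONENT = 3.0    # alpha (Table 1)
PATH_LOSS_CONSTANT = 1e-2   # G, assumed reading of the table entry
SHADOWING_STD_DB = 8.0      # log-normal shadowing standard deviation (Table 1)
RB_BANDWIDTH_HZ = 180e3     # assumption: not given in the source


def dbm_to_watts(dbm: float) -> float:
    """Convert a power in dBm to watts."""
    return 10 ** ((dbm - 30.0) / 10.0)


def channel_gain(distance_m: float, shadowing_db: float = 0.0) -> float:
    """Linear gain G * d^(-alpha), scaled by a shadowing term given in dB."""
    path_gain = PATH_LOSS_CONSTANT * distance_m ** (-PATH_LOSS_EXPONENT)
    return path_gain * 10 ** (shadowing_db / 10.0)


def achievable_rate_bps(tx_power_dbm: float, distance_m: float,
                        shadowing_db: float = 0.0) -> float:
    """Shannon rate on one resource block, noise-limited (no interference)."""
    noise_w = dbm_to_watts(NOISE_PSD_DBM_HZ) * RB_BANDWIDTH_HZ
    snr = dbm_to_watts(tx_power_dbm) * channel_gain(distance_m, shadowing_db) / noise_w
    return RB_BANDWIDTH_HZ * math.log2(1.0 + snr)


# Example: one MTCD at 200 m transmitting at full power with a random shadowing draw.
shadowing = random.gauss(0.0, SHADOWING_STD_DB)
rate = achievable_rate_bps(P_MAX_DBM, 200.0, shadowing)
```

A zero-mean Gaussian in dB is what makes the shadowing log-normal in linear scale, which is why `random.gauss` is applied to the dB value rather than to the linear gain.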

Figure 2

Figure 3

Figure 4

Figure 5

