Node selection method in federated learning based on deep reinforcement learning

doi:10.11959/j.issn.1000-436x.2021111

Abstract

To cope with the impact of heterogeneous device computing capabilities and non-independent and identically distributed (non-IID) data on federated learning performance, and to schedule terminal devices efficiently for model aggregation, a node selection method based on deep reinforcement learning was proposed. The method accounted for the training quality and efficiency of heterogeneous terminal devices and filtered out malicious nodes, so as to achieve higher model accuracy and shorter training delay in federated learning. Firstly, according to the characteristics of distributed model training in federated learning, a node selection system model based on deep reinforcement learning was constructed. Secondly, considering factors such as device training delay, model transmission delay and accuracy, an accuracy optimization model for node selection was proposed. Finally, the problem was formulated as a Markov decision process, and a node selection algorithm based on distributed proximal policy optimization (DPPO) was designed to obtain a reasonable set of devices before each training iteration to complete model aggregation. Simulation results demonstrate that the proposed method significantly improves the accuracy and training speed of federated learning, and that it also converges well and is robust.

Key words: federated learning, model aggregation, node selection, deep reinforcement learning, accuracy

CLC Number: TP911.1

Wenchen HE, Shaoyong GUO, Xuesong QIU, Liandong CHEN, Suxiang ZHANG. Node selection method in federated learning based on deep reinforcement learning[J]. Journal on Communications, 2021, 42(6): 62-71.

Figures/Tables 12

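Parameter	Meaning
Z	Set of MEC servers
H_i	Total dataset related to FL task i
H_{z,d}	Dataset of terminal d covered by MEC server z
$\| H_{i} \|$	Size of the total dataset related to FL task i
Ω_i	Attribute set of FL task i
$M_{i}^{0}$	Initial model of FL task i
C_i	Number of CPU cycles required by FL task i to process one unit of data
$l_{z, d}^{i}$	Loss function of device d when training FL task i
$\omega_{z, d}^{n}$	Model parameters of device d in the n-th training round
A_i	Sum of the loss functions over the test dataset of FL task i
$r_{i}^{d}$	Transmission rate of FL task i between device d and the micro base station
$r_{i}^{z}$	Transmission rate of FL task i between micro base station z and the aggregation server
$r_{i}^{z}$	CPU frequency of device d when executing the FL task
$t_{i}^{tra}$	Total transmission delay of FL task i
$t_{i}^{com}$	Computation delay of FL task i on the device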
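
The delay quantities in the table above can be illustrated with a minimal Python sketch. The formulas here are assumptions, not taken from the paper: computation delay is modeled as CPU cycles per sample times local dataset size divided by CPU frequency, and transmission delay as model size divided by each link rate. The names `Device`, `compute_delay`, `transmission_delay`, `round_delay` and the parameter `model_bits` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Device:
    samples: int           # local dataset size |H_{z,d}|
    cpu_hz: float          # CPU frequency of the device, in cycles per second
    rate_to_bs: float      # r_i^d: device <-> micro base station rate, in bit/s
    rate_to_server: float  # r_i^z: micro base station <-> aggregation server rate, in bit/s


def compute_delay(dev: Device, cycles_per_sample: float, local_iters: int = 1) -> float:
    """Assumed model: t_com = iterations * C_i * |H_{z,d}| / CPU frequency."""
    return local_iters * cycles_per_sample * dev.samples / dev.cpu_hz


def transmission_delay(dev: Device, model_bits: float) -> float:
    """Assumed model: the model crosses both hops, each at its own rate."""
    return model_bits / dev.rate_to_bs + model_bits / dev.rate_to_server


def round_delay(devices, cycles_per_sample: float, model_bits: float,
                local_iters: int = 1) -> float:
    """Assumed synchronous aggregation: the round waits for the slowest device."""
    return max(
        compute_delay(d, cycles_per_sample, local_iters) + transmission_delay(d, model_bits)
        for d in devices
    )


fast = Device(samples=1000, cpu_hz=1e9, rate_to_bs=1e6, rate_to_server=1e7)
slow = Device(samples=1000, cpu_hz=5e8, rate_to_bs=1e6, rate_to_server=1e7)
print(round_delay([fast, slow], cycles_per_sample=1e6, model_bits=1e6))
```

Under these assumptions the slow device dominates the round, which is the kind of straggler effect a node selection policy would try to avoid.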

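Parameter type	Parameter	Setting
Network and model parameters	Number of MEC servers	10
Network and model parameters	Terminals per MEC server	80
Network and model parameters	Terminal CPU cores	[10%, 100%]
Network and model parameters	Local dataset size	[100, 2000]
Network and model parameters	Local iterations (MNIST/CIFAR)	5
Network and model parameters	Convolutional layers (MNIST/CIFAR)	2/5
Network and model parameters	Fully connected layers (MNIST/CIFAR)	4/3
Network and model parameters	Probability that a node does not train	[80%, 100%]
Network and model parameters	Proportion of IID data	[80%, 100%]
DPPO parameters	Number of agents	4
DPPO parameters	Training steps	1000
DPPO parameters	Actor α	0.0001
DPPO parameters	Critic α	0.0002
DPPO parameters	Reward discount factor σ	0.9
DPPO parameters	Clipping step size ε	0.2
DPPO parameters	Policy update steps (circle)	100
DPPO parameters	Minimum sample count (batch size)	64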
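
To make the roles of the discount factor (0.9) and the clipping step size ε (0.2) in the table above concrete, the following is a minimal Python sketch of discounted returns and the standard PPO clipped surrogate objective. It is an illustration under assumptions: whether the paper's DPPO algorithm uses exactly this form is not confirmed by this page, and the function names `discounted_returns` and `clipped_surrogate` are hypothetical.

```python
def discounted_returns(rewards, gamma=0.9):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def clipped_surrogate(ratios, advantages, epsilon=0.2):
    """Mean of min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A) over samples.

    `ratios` are new-policy / old-policy probability ratios and `advantages`
    are advantage estimates; both are plain lists of floats in this sketch.
    """
    terms = []
    for ratio, advantage in zip(ratios, advantages):
        clipped_ratio = min(max(ratio, 1.0 - epsilon), 1.0 + epsilon)
        terms.append(min(ratio * advantage, clipped_ratio * advantage))
    return sum(terms) / len(terms)


print(discounted_returns([1.0, 1.0, 1.0]))
print(clipped_surrogate([1.5, 0.5], [1.0, -1.0]))
```

The clipping bounds how far a single update can move the policy away from the one that collected the samples: a ratio of 1.5 with a positive advantage is credited only as if it were 1.2.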
