Journal on Communications ›› 2023, Vol. 44 ›› Issue (5): 79-93. doi: 10.11959/j.issn.1000-436x.2023072

• Academic Paper •

  • About the authors:
    LI Kaiju (1992- ), female, Tujia, born in Enshi, Hubei, is a Ph.D. candidate at Chongqing University. Her research interests include federated learning and privacy protection.
    XU Qiang (1992- ), male, born in Ganzhou, Jiangxi, Ph.D., is a postdoctoral fellow at City University of Hong Kong. His research interests include video security and image processing.
    WANG Hao (1990- ), male, born in Zhumadian, Henan, Ph.D., is an associate professor at Chongqing University of Posts and Telecommunications. His research interests include federated learning and privacy protection.

Communication-efficient federated learning method via redundant data elimination

Kaiju LI1,2, Qiang XU3, Hao WANG1,4   

  1. 1 School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
    2 College of Computer Science, Chongqing University, Chongqing 400044, China
    3 Department of Electrical Engineering, City University of Hong Kong, Hong Kong 999077, China
    4 Key Laboratory of Tourism Multisource Data Perception and Decision, Ministry of Culture and Tourism, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Revised:2023-02-04 Online:2023-05-25 Published:2023-05-01
  • Supported by:
    The National Natural Science Foundation of China(42001398);The Natural Science Foundation of Chongqing(cstc2020jcyj-msxmX0635);Chongqing Postdoctoral Research Program Special Funding(2021XM3009);China Postdoctoral Foundation(2021M693929)


Abstract:

To address the impact of the limited network bandwidth of edge devices on the communication efficiency of federated learning (FL), and to transmit local model updates efficiently for model aggregation, a communication-efficient federated learning method via redundant data elimination was proposed. By analyzing the essential causes of redundant update parameters, and in light of the non-IID data distribution and distributed training characteristics of FL, new definitions of coreset sensitivity and loss-function tolerance were given, and a federated coreset construction algorithm was proposed. Furthermore, to fit the extracted coreset, a distributed adaptive model evolution mechanism was designed to dynamically adjust the structure and size of the training model before each training iteration, which reduced the number of bits transmitted between edge devices and the cloud server while guaranteeing the accuracy of the trained model. Simulation results show that, compared with the state-of-the-art method, the proposed method reduces the number of communication bits by 17% with only a 0.5% loss in model accuracy.
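The coreset construction summarized above is built on a sensitivity measure over training examples. The paper's own sensitivity and tolerance definitions are not reproduced here; as a generic illustration only, the sketch below uses per-example loss as a sensitivity surrogate and draws a weighted coreset by importance sampling, reweighting each sample so that the weighted coreset loss is an unbiased estimate of the full-data loss. The function name and the loss-based surrogate are assumptions, not the authors' algorithm.

```python
import random

def coreset_sample(losses, m, seed=0):
    """Sample a weighted coreset of size m by loss-proportional importance sampling.

    Each example i is drawn with probability p_i = loss_i / sum(losses)
    (a common surrogate for coreset sensitivity) and given weight
    1 / (m * p_i), so sum_j w_j * loss_{i_j} is an unbiased estimator
    of the total loss over the full dataset.
    """
    rng = random.Random(seed)
    total = sum(losses)
    probs = [l / total for l in losses]
    # Draw m indices with replacement, proportional to sensitivity.
    idx = rng.choices(range(len(losses)), weights=probs, k=m)
    weights = [1.0 / (m * probs[i]) for i in idx]
    return idx, weights
```

With loss-proportional probabilities, each term w_j * loss_{i_j} equals total/m exactly, so the estimator has zero variance on the quantity it was calibrated for; in practice the same weighted coreset is then used to approximate gradients on unseen model states, which is where the sensitivity and tolerance analysis matters.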

Key words: federated learning, communication efficiency, coreset, model evolution, accuracy

