Chinese Journal of Network and Information Security, 2022, Vol. 8, Issue 5: 56-65. doi: 10.11959/j.issn.2096-109x.2022069

• Topic: Security of Big Data and Artificial Intelligence •

Privacy-preserving federated learning framework with dynamic weight aggregation

Zuobin YING1, Yichen FANG1, Yiwen ZHANG2

  1 City University of Macau, Macau 999078, China
  2 Anhui Xinhua University, Hefei 230000, China
  • Revised: 2022-09-06  Online: 2022-10-15  Published: 2022-10-01
  • About the authors: Zuobin YING (1982- ), male, born in Wuhu, Anhui; assistant professor at City University of Macau; his research interests include blockchain and federated learning.
    Yichen FANG (1998- ), female, born in Huzhou, Zhejiang; master's student at City University of Macau; her research interests include differential privacy and federated learning.
    Yiwen ZHANG (1980- ), female, born in Fuyang, Anhui; professor at Anhui Xinhua University; her research interests include data mining and federated learning.
  • Supported by:
    The Science and Technology Development Fund of Macau (0038/2022/A)


Abstract:

Privacy-preserving federated learning frameworks operating under an untrusted central server face two problems. ① Fixed weights, typically proportional to each participant's dataset size, are used when the central server aggregates the distributed models. Since different participants hold non-independent and identically distributed (non-IID) data, fixed aggregation weights prevent the global model from reaching optimal utility. ② Existing frameworks assume an honest central server and therefore ignore the leakage of participants' private data caused by an untrusted server. To address these problems, DP-DFL, a privacy-preserving federated learning framework with dynamic weight aggregation under an untrusted central server, was proposed on the basis of the popular DP-FedAvg algorithm. DP-DFL learns the model aggregation weights directly from the data of the different participants, and is therefore suited to non-IID data environments. In addition, noise is injected into the model parameters during the local privacy-protection phase, which satisfies the untrusted-central-server setting and reduces the risk of privacy leakage when local participants upload their model parameters. Experiments on the CIFAR-10 dataset show that DP-DFL not only provides local privacy guarantees but also achieves higher accuracy, improving average model accuracy by 2.09% compared with the DP-FedAvg algorithm.
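The contrast between fixed and dynamic aggregation weights described in the abstract can be sketched as follows. This is a minimal illustrative example, not the authors' implementation; the parameter vectors, dataset sizes, and the `learned_weights` values are placeholders, and how DP-DFL actually learns its weights is specified in the paper itself.

```python
import numpy as np

def aggregate(client_params, weights):
    """Weighted average of client model parameter vectors."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so weights sum to 1
    return sum(w * p for w, p in zip(weights, client_params))

client_params = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]

# Fixed weights (FedAvg / DP-FedAvg): proportional to local dataset size.
dataset_sizes = [100, 300]
global_fixed = aggregate(client_params, dataset_sizes)    # → [2.5, 3.5]

# Dynamic weights (the DP-DFL idea): weights are learned from the
# participants' data each round rather than fixed in advance.
# The values below are stand-ins for learned quantities.
learned_weights = [0.6, 0.4]
global_dynamic = aggregate(client_params, learned_weights)  # → [1.8, 2.8]
```

Under non-IID data, a client with many samples may still hold an unrepresentative distribution, which is why size-proportional weights can be suboptimal.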

Key words: federated learning, differential privacy, dynamic aggregation weight, non-independent and identically distributed data
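The local privacy-protection step mentioned in the abstract, adding noise to model parameters before they leave the client, is commonly realized with the Gaussian mechanism of differential privacy. The sketch below assumes that mechanism (clip to a bounded L2 norm, then add Gaussian noise); the function name and parameters are illustrative, not taken from the paper.

```python
import numpy as np

def privatize_update(update, clip_norm, noise_multiplier, rng=None):
    """Clip a model update to L2 norm <= clip_norm, then add Gaussian
    noise with standard deviation noise_multiplier * clip_norm, so the
    update is protected before an untrusted server ever sees it."""
    rng = rng if rng is not None else np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

# With noise_multiplier = 0 only clipping acts: [3, 4] has norm 5,
# so it is scaled down to [0.6, 0.8].
print(privatize_update(np.array([3.0, 4.0]), clip_norm=1.0, noise_multiplier=0.0))
```

Because the noise is added locally, the privacy guarantee does not rely on the server's honesty, matching the untrusted-server setting the paper targets.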


