Big Data Research ›› 2022, Vol. 8 ›› Issue (5): 12-32. doi: 10.11959/j.issn.2096-0271.2022038

• Topic: Data Circulation and Privacy Computing •

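Threats and defenses of federated learning: a survey

Jianhan WU1,2, Shijing SI1, Jianzong WANG1, Jing XIAO1

  1 Ping An Technology (Shenzhen) Co., Ltd., Shenzhen 518063, China
  2 University of Science and Technology of China, Hefei 230026, China
  • Online: 2022-09-15 Published: 2022-09-01
  • About the authors: Jianhan WU (1998- ), male, master's student at the University of Science and Technology of China, algorithm engineer at Ping An Technology (Shenzhen) Co., Ltd., student member of the China Computer Federation (CCF). His main research interests are computer vision and federated learning
    Shijing SI (1988- ), male, Ph.D., senior algorithm researcher at Ping An Technology (Shenzhen) Co., Ltd., CCF member. His main research interests are machine learning and its applications in artificial intelligence
    Jianzong WANG (1983- ), male, Ph.D., deputy chief engineer, senior director of artificial intelligence, and general manager of the Federated Learning Technology Department at Ping An Technology (Shenzhen) Co., Ltd., CCF senior member, member of the CCF Big Data Expert Committee. His main research interests include federated learning and artificial intelligence
    Jing XIAO (1972- ), male, Ph.D., chief scientist at Ping An Technology (Shenzhen) Co., Ltd., recipient of the 2019 Wu Wenjun Artificial Intelligence Outstanding Contribution Award, vice chair of the CCF Shenzhen Member Activity Center. His main research interests include computer graphics, autonomous driving, 3D display, medical diagnosis, and federated learning
  • Supported by:
    The Key Research and Development Program of Guangdong Province, "New Generation Artificial Intelligence" major project (2021B0101400003)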

Abstract:

With the widespread application of machine learning, data security incidents occur from time to time, and the demand for privacy protection is growing. This reduces the willingness of different entities to share data, makes it difficult to use data fully, and gives rise to "data islands". Federated learning (FL) is an effective way to address the data island problem. It is essentially a form of distributed machine learning whose defining characteristic is that user data stays with the user locally, so the joint training process does not expose the participants' raw data. Nevertheless, federated learning still faces many security risks in practice that require further study. The attacks that federated learning may suffer and the corresponding defense measures were reviewed systematically. First, the possible attacks and threats were classified according to the training stages of federated learning, the attack methods in each category were enumerated, and the principles behind them were introduced. Then, specific defense measures against these attacks and threats were summarized together with an analysis of their principles, to provide a detailed reference for researchers new to the field. Finally, future work in this research area was discussed, and several directions that need particular attention were pointed out to help improve the security of federated learning.

Key words: federated learning, attack, defense, privacy protection, machine learning
