Chinese Journal of Network and Information Security ›› 2023, Vol. 9 ›› Issue (2): 1-20. doi: 10.11959/j.issn.2096-109x.2023017

• Review •

A survey of vertical federated learning methods and their privacy and security

Jinyin Chen1,2, Rongchang Li2, Guohan Huang2, Tao Liu2, Haibin Zheng1, Yao Cheng3

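  1. Institute of Cyberspace Security, Zhejiang University of Technology, Hangzhou, Zhejiang 310023, China
  2. College of Information Engineering, Zhejiang University of Technology, Hangzhou, Zhejiang 310023, China
  3. TÜV SÜD Asia Pacific Pte. Ltd., Singapore 60993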
  • Revised: 2022-08-20 Online: 2023-04-25 Published: 2023-04-01
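  • About the authors: Jinyin Chen (1982- ), female, born in Xiangshan, Zhejiang, is a professor at Zhejiang University of Technology. Her main research interests are artificial intelligence security, graph data mining, and evolutionary computing.
    Rongchang Li (1998- ), male, born in Changxing, Zhejiang, is a master's student at Zhejiang University of Technology. His main research interests are artificial intelligence security, graph data mining, and federated learning.
    Guohan Huang (1997- ), male, born in Taizhou, Zhejiang, is a master's student at Zhejiang University of Technology. His main research interests are graph neural networks and the security of vertical federated learning.
    Tao Liu (1998- ), male, born in Shaoxing, Zhejiang, is a master's student at Zhejiang University of Technology. His main research interests are artificial intelligence security and federated learning.
    Haibin Zheng (1995- ), male, born in Taizhou, Zhejiang, is a lecturer at Zhejiang University of Technology. His main research interests are deep learning, artificial intelligence security, and image recognition.
    Yao Cheng (1987- ), female, from Singapore, Ph.D., is a researcher at TÜV SÜD Asia Pacific Pte. Ltd. Her main research interests are the security and privacy of deep learning systems, applications of blockchain technology, and vulnerability analysis of the Android framework.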
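  • Funding:
    National Natural Science Foundation of China (62072406); Zhejiang Provincial Natural Science Foundation (DQ23F020001); National Key Laboratory of Science and Technology on Information System Security (61421110502); National Key R&D Program of China (2018AAA0100801)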

Survey on vertical federated learning: algorithm, privacy and security

Jinyin CHEN1,2, Rongchang LI2, Guohan HUANG2, Tao LIU2, Haibin ZHENG1, Yao CHENG3   

  1. Institute of Cyberspace Security, Zhejiang University of Technology, Hangzhou 310023, China
  2. College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
  3. TÜV SÜD Asia Pacific Pte. Ltd., Singapore 60993
  • Revised: 2022-08-20 Online: 2023-04-25 Published: 2023-04-01
  • Supported by:
    The National Natural Science Foundation of China (62072406); Zhejiang Provincial Natural Science Foundation (DQ23F020001); The National Key Laboratory of Science and Technology on Information System Security (61421110502); The National Key R&D Program of China (2018AAA0100801)

Summary:

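Federated learning (FL) is an emerging distributed machine learning technique that uses data dispersed across institutions to jointly build machine learning models by transmitting intermediate results (such as model parameters, parameter gradients, and embeddings). Because an institution's training data is not allowed to leave the local site, FL lowers the risk of data leakage. Depending on how data is distributed among institutions, FL is usually divided into horizontal federated learning (HFL), vertical federated learning (VFL), and federated transfer learning (TFL). VFL suits scenarios in which institutions share the same sample space but hold different feature spaces, and it is widely applied in medical diagnosis, financial assessment, and educational services. Although VFL performs well in real-world applications, it still faces many privacy and security problems, and a comprehensive survey of VFL methods and their security is still lacking. To support the construction of efficient and secure VFL systems, this survey covers both VFL methods and their privacy and security. It first summarizes existing VFL methods in detail from four perspectives: edge models, communication mechanisms, alignment mechanisms, and label processing mechanisms. It then introduces and analyzes the privacy and security risks that VFL faces, and goes on to introduce and summarize the corresponding defense methods. In addition, it introduces common datasets and platform frameworks suitable for VFL. Finally, in light of the security challenges facing VFL, it outlines future research directions, aiming to provide a reference for theoretical research on building efficient, robust, and secure VFL.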

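Keywords: vertical federated learning, security and privacy, backdoor attack, inference attack and defense, adversarial attack, security evaluation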

Abstract:

Federated learning (FL) is a distributed machine learning technology that enables the joint construction of machine learning models by transmitting intermediate results (e.g., model parameters, parameter gradients, and embedding representations) computed on data distributed across institutions. FL reduces the risk of privacy leakage because raw data is not allowed to leave the institution. According to the difference in data distribution between institutions, FL is usually divided into horizontal federated learning (HFL), vertical federated learning (VFL), and federated transfer learning (TFL). VFL is suitable for scenarios where institutions have the same sample space but different feature spaces, and it is widely used in fields such as medical diagnosis, financial assessment, and educational services. Although VFL performs well in real-world applications, it still faces many privacy and security challenges. To the best of our knowledge, no comprehensive survey has been conducted on VFL methods and their privacy and security. The existing VFL methods were analyzed from four perspectives: the basic framework, communication mechanism, alignment mechanism, and label processing mechanism. Then the privacy and security risks faced by VFL and the related defense methods were introduced and analyzed. Additionally, common datasets, evaluation indicators, and platform frameworks suitable for VFL were presented. Considering the existing challenges and problems, the future directions and development trends of VFL were outlined to provide a reference for theoretical research on building efficient, robust, and secure VFL.

Key words: vertical federated learning, security and privacy, backdoor attack, inference attack and defense, adversarial attack, security evaluation

