Chinese Journal of Network and Information Security ›› 2023, Vol. 9 ›› Issue (2): 1-20. doi: 10.11959/j.issn.2096-109x.2023017

• Comprehensive Reviews •

Survey on vertical federated learning: algorithm, privacy and security

Jinyin CHEN1,2, Rongchang LI2, Guohan HUANG2, Tao LIU2, Haibin ZHENG1, Yao CHENG3   

  • 1 Institute of Cyberspace Security, Zhejiang University of Technology, Hangzhou 310023, China
    2 College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
    3 TÜV SÜD Asia Pacific Pte. Ltd., 60993, Singapore
  • Revised: 2022-08-20 Online: 2023-04-25 Published: 2023-04-01
  • Supported by:
    The National Natural Science Foundation of China (62072406); Zhejiang Provincial Natural Science Foundation (DQ23F020001); The National Key Laboratory of Science and Technology on Information System Security (61421110502); The National Key R&D Program of China (2018AAA0100801)

Abstract:

Federated learning (FL) is a distributed machine learning technique in which institutions jointly build models by exchanging intermediate results (e.g., model parameters, parameter gradients, and embedding representations) computed on their locally held data. Because raw data never leaves the institution that owns it, FL reduces the risk of privacy leakage. According to how data is distributed among institutions, FL is usually divided into horizontal federated learning (HFL), vertical federated learning (VFL), and federated transfer learning (FTL). VFL suits scenarios in which institutions share the same sample space but hold different feature spaces, and it is widely used in fields such as medical diagnosis, finance, and security. Although VFL performs well in real-world applications, it still faces many privacy and security challenges. To the best of our knowledge, no comprehensive survey has been conducted on privacy and security methods for VFL. Existing VFL methods were analyzed from four perspectives: the basic framework, communication mechanism, alignment mechanism, and label processing mechanism. Then the privacy and security risks faced by VFL and the related defense methods were introduced and analyzed. Additionally, the common datasets and evaluation metrics suitable for VFL, as well as platform frameworks, were presented. Considering the existing challenges and problems, the future directions and development trends of VFL were outlined, to provide a reference for theoretical research on building efficient, robust, and secure VFL.
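
To make the vertical setting concrete, the following is a minimal sketch, not the method of any surveyed work. It assumes two hypothetical parties that hold different features for overlapping sample IDs, with only one party holding labels. The parties align their samples, exchange partial logits and residuals instead of raw features, and each updates only its own weights. All names and data here are illustrative assumptions. The plain set intersection stands in for a private set intersection protocol, and the exchanged values are left unencrypted, which a real VFL system would not do.

```python
import math

# Hypothetical toy data: same users, different feature spaces per party.
party_a = {"u1": [1.0, 0.0], "u2": [0.0, 1.0], "u3": [1.0, 1.0], "u5": [0.5, 0.5]}
party_b = {"u1": [2.0], "u2": [-1.0], "u3": [0.5], "u4": [1.0]}
labels_b = {"u1": 1, "u2": 0, "u3": 1, "u4": 0}  # only party B holds labels


def align_ids(ids_a, ids_b):
    """Sample alignment. Assumption: plain intersection replaces a private protocol."""
    return sorted(set(ids_a) & set(ids_b))


def partial_logit(weights, features):
    """Intermediate result a party shares in place of its raw features."""
    return sum(w * x for w, x in zip(weights, features))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(steps=200, lr=0.5):
    ids = align_ids(party_a, party_b)
    w_a, w_b = [0.0, 0.0], [0.0]
    n = len(ids)
    for _ in range(steps):
        # Each party computes partial logits on its own features.
        z_a = {i: partial_logit(w_a, party_a[i]) for i in ids}
        z_b = {i: partial_logit(w_b, party_b[i]) for i in ids}
        # Party B (label holder) combines them and computes residuals.
        residual = {i: sigmoid(z_a[i] + z_b[i]) - labels_b[i] for i in ids}
        # Residuals go back to party A; each party updates only its own weights.
        w_a = [w - lr * sum(residual[i] * party_a[i][k] for i in ids) / n
               for k, w in enumerate(w_a)]
        w_b = [w - lr * sum(residual[i] * party_b[i][k] for i in ids) / n
               for k, w in enumerate(w_b)]
    return ids, w_a, w_b


def predict(ids, w_a, w_b):
    """Joint prediction still needs a partial logit from every party."""
    return {i: sigmoid(partial_logit(w_a, party_a[i]) + partial_logit(w_b, party_b[i]))
            for i in ids}


if __name__ == "__main__":
    ids, w_a, w_b = train()
    print(ids, predict(ids, w_a, w_b))
```

Even in this toy version, the residuals returned to the label-free party depend on the labels, which hints at why the exchanged intermediate results are a target for the inference attacks discussed in the survey.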

Key words: vertical federated learning, security and privacy, backdoor attack, inference attack and defense, adversarial attack, security evaluation

CLC Number: 
