Chinese Journal of Network and Information Security ›› 2023, Vol. 9 ›› Issue (6): 127-139. doi: 10.11959/j.issn.2096-109x.2023088

• Academic Paper •

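GDPR-oriented intelligent checking method of privacy policies compliance

Xin LI¹, Peng TANG¹, Xiheng ZHANG¹, Weidong QIU¹, Hong HUI²

  1. School of Cyberspace Security, Shanghai Jiao Tong University, Shanghai 200240, China
  2. Institute of Cyber Science and Technology, Shanghai Jiao Tong University, Shanghai 200240, China
  • Revised: 2023-07-04 Online: 2023-12-01 Published: 2023-12-01
  • About the authors:
    Xin LI (1999- ), male, born in Suqian, Jiangsu, master's student at Shanghai Jiao Tong University. Main research interests: natural language processing and privacy protection.
    Peng TANG (1992- ), male, born in Fuzhou, Jiangxi, Ph.D. candidate at Shanghai Jiao Tong University. Main research interests: artificial intelligence security and privacy protection.
    Xiheng ZHANG (1999- ), male, born in Liaocheng, Shandong, master's student at Shanghai Jiao Tong University. Main research interests: federated learning and privacy protection.
    Weidong QIU (1973- ), male, born in Jiujiang, Jiangxi, Ph.D., professor and doctoral supervisor at Shanghai Jiao Tong University. Main research interests: cryptanalysis and cryptographic engineering, artificial intelligence security, and big data privacy protection.
    Hong HUI (1969- ), female, born in Tianjin, Ph.D., associate professor at Shanghai Jiao Tong University. Main research interests: image processing and pattern recognition, and information security.
  • Supported by:
    The National Natural Science Foundation of China (61972249); The National Key R&D Program of China (2023YFB3106500)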

Abstract:

Since the EU's General Data Protection Regulation (GDPR) took effect in 2018, more than 300 fines have been issued under it, including large penalties for prominent companies such as Google for failing to provide transparent and comprehensible privacy policies. The GDPR, known as the strictest data protection law in history, has made companies worldwide more cautious when offering cross-border services, particularly to the European Union. Its territorial scope stipulates that it applies to any company providing services to EU citizens, irrespective of whether the company is registered in the EU. Companies anywhere in the world with overseas operations, Chinese enterprises included, must therefore consider whether their privacy policies comply with the GDPR. To meet this need, an intelligent detection method was developed. The privacy policies of online service companies were extracted automatically and, using machine learning and automation techniques, converted into a standardized format with a hierarchical structure. The policy text was then classified with natural language processing to identify the GDPR concepts it covers. Finally, a purpose-built GDPR knowledge graph (taxonomy) was used to check whether a policy omits any concept that the GDPR requires to be disclosed. This approach enables intelligent detection of GDPR-oriented privacy policy compliance and supports Chinese enterprises in providing cross-border services to EU users. Analysis of the corpus samples further reveals that mainstream online service companies generally fall short of GDPR compliance requirements.
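
The final checking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the concept labels in `REQUIRED_CONCEPTS`, the `Section` structure, and the keyword matcher in `classify` are all hypothetical stand-ins (the paper uses a trained NLP classifier and its own GDPR knowledge graph, neither of which is reproduced here). The sketch only shows the idea of walking a hierarchical policy, collecting the concepts detected in each section, and reporting the required concepts that never appear.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Hypothetical subset of GDPR disclosure concepts (illustrative labels only).
REQUIRED_CONCEPTS = {
    "controller_identity",
    "processing_purpose",
    "legal_basis",
    "data_retention",
    "data_subject_rights",
}

# Assumed stand-in for a trained text classifier: naive keyword matching.
KEYWORDS = {
    "controller_identity": ("data controller", "contact us"),
    "processing_purpose": ("purpose", "we use your"),
    "legal_basis": ("legal basis", "consent"),
    "data_retention": ("retain", "retention"),
    "data_subject_rights": ("right to", "erasure"),
}


@dataclass
class Section:
    """One node of a privacy policy in hierarchical form (hypothetical schema)."""
    heading: str
    text: str = ""
    children: List["Section"] = field(default_factory=list)


def classify(text: str) -> Set[str]:
    """Return the concepts whose keywords occur in the text."""
    lowered = text.lower()
    return {c for c, kws in KEYWORDS.items() if any(k in lowered for k in kws)}


def detected_concepts(section: Section) -> Set[str]:
    """Collect concepts found in a section and, recursively, its subsections."""
    found = classify(section.heading + " " + section.text)
    for child in section.children:
        found |= detected_concepts(child)
    return found


def missing_concepts(policy: Section) -> Set[str]:
    """Required concepts that the policy never discloses."""
    return REQUIRED_CONCEPTS - detected_concepts(policy)


policy = Section("Privacy Policy", children=[
    Section("How we use data", "We use your email for the purpose of account login."),
    Section("Retention", "We retain records for 12 months."),
    Section("Your rights", "You have the right to request erasure."),
])
print(sorted(missing_concepts(policy)))  # prints ['controller_identity', 'legal_basis']
```

In this toy policy, the sections cover purpose, retention, and data subject rights, so the check flags the two concepts that are absent. Replacing `classify` with a real classifier would leave the set-difference check unchanged.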

Key words: GDPR, privacy policy, hierarchical structure, compliance checking
