通信学报 (Journal on Communications), 2021, Vol. 42, Issue (11): 133-144. doi: 10.11959/j.issn.1000-436x.2021215

• Academic Paper •

Threat analysis and defense methods of deep-learning-based data theft in data sandbox mode

Hezhong PAN1, Peiyi HAN2, Xiayu XIANG1, Shaoming DUAN2, Rongfei ZHUANG2, Chuanyi LIU2,3

  1. School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China
    2. School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
    3. Cyberspace Security Research Center, Peng Cheng Laboratory, Shenzhen 518066, China
  • Revised: 2021-08-31; Online: 2021-11-25; Published: 2021-11-01
  • About the authors: Hezhong PAN (1991– ), male, born in Benxi, Liaoning; Ph.D. candidate at Beijing University of Posts and Telecommunications. Research interests: cloud security, data security, and cryptography.
    Peiyi HAN (1992– ), male, born in Lüliang, Shanxi; Ph.D., assistant researcher at Harbin Institute of Technology (Shenzhen). Research interests: data security and privacy protection.
    Xiayu XIANG (1991– ), male, born in Huayuan, Hunan; Ph.D. candidate at Beijing University of Posts and Telecommunications. Research interests: privacy protection and medical big data analytics.
    Shaoming DUAN (1994– ), male, born in Shaoyang, Hunan; Ph.D. candidate at Harbin Institute of Technology (Shenzhen). Research interests: data security and machine learning.
    Rongfei ZHUANG (1992– ), male, born in Quanzhou, Fujian; Ph.D. candidate at Harbin Institute of Technology (Shenzhen). Research interests: data security, machine learning security, and privacy protection.
    Chuanyi LIU (1982– ), male, born in Leshan, Sichuan; Ph.D., professor at Harbin Institute of Technology (Shenzhen). Research interests: cloud computing and cloud security, large-scale storage systems, and data protection and data security.
  • Supported by:
    The National Natural Science Foundation of China (61872110)


Abstract:

The threat model of deep-learning-based data theft in data sandbox mode was analyzed in detail, and the degree of damage and the distinguishing characteristics of the attack were quantitatively evaluated in both the data processing stage and the model training stage. Against attacks in the data processing stage, a data-leakage prevention method based on model pruning was proposed, which reduces the amount of leaked data while preserving the availability of the original model. Against attacks in the model training stage, an attack detection method based on model parameter analysis was proposed to intercept malicious models and prevent data leakage. Neither defense requires modifying or encrypting the data, nor manually analyzing the deep learning training code, so both are well suited to defending against data theft in data sandbox mode. Experimental evaluation shows that the pruning-based defense reduces data leakage by up to 73%, and the parameter-analysis-based detection effectively identifies more than 95% of attack attempts.
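The two defenses named in the abstract — model pruning and detection via model parameter analysis — can be illustrated with a minimal sketch. The functions below are illustrative assumptions only (NumPy, magnitude-based pruning, a hypothetical deviation threshold); they are not the paper's actual algorithms or thresholds:

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights.

    Pruning low-magnitude weights removes the small perturbations an
    attacker may have used to encode training data into the model, while
    keeping the dominant weights that carry task performance.
    """
    flat = np.abs(weights).ravel()
    k = int(flat.size * sparsity)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def looks_malicious(weights: np.ndarray, baseline_std: float,
                    deviation_threshold: float = 3.0) -> bool:
    """Crude parameter-statistics check: flag a layer whose weight spread
    deviates strongly (relative to `deviation_threshold`) from the spread
    observed in benign reference models."""
    relative_deviation = abs(weights.std() - baseline_std) / baseline_std
    return relative_deviation > deviation_threshold
```

In this sketch, pruning trades a small amount of model accuracy for removing potential leakage channels, and the detector only needs statistics of the trained parameters — neither touches the raw data or the training code, matching the deployment constraint described in the abstract.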

Key words: data sandbox, data theft, AI security


