Journal on Communications (通信学报) ›› 2016, Vol. 37 ›› Issue (9): 175-182. doi: 10.11959/j.issn.1000-436x.2016169

• Academic Communication •

DiffPRFs: a differential privacy protection algorithm for random forests

Hai-rong MU (穆海蓉), Li-ping DING (丁丽萍), Yu-ning SONG (宋宇宁), Guo-qing LU (卢国庆)

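  1. National Engineering Research Center of Fundamental Software, Institute of Software, Chinese Academy of Sciences, Beijing 100190
  • Online: 2016-09-25  Published: 2016-09-28
  • Supported by:
    The National High Technology Research and Development Program of China ("863" Program)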

DiffPRFs: random forest under differential privacy

Hai-rong MU, Li-ping DING, Yu-ning SONG, Guo-qing LU

  1. National Engineering Research Center of Fundamental Software, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
  • Online: 2016-09-25  Published: 2016-09-28
  • Supported by:
    The National High Technology Research and Development Program of China (863 Program)

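Abstract:

This paper proposes DiffPRFs, a differential privacy protection algorithm based on random forests. While each decision tree is built, the exponential mechanism selects the split points and split attributes, and noise is added according to the Laplace mechanism. The differential privacy requirement is met throughout the algorithm. Unlike existing algorithms, the method needs no discretization preprocessing of the data, which removes the cost that discretizing large multi-dimensional data imposes on classification system performance; classification is carried out conveniently and classification accuracy stays high. Experimental results confirm the effectiveness of the algorithm and its advantages over other classification algorithms.

Key words: differential privacy, privacy protection, random forest, data mining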

Abstract:

A differential privacy algorithm based on random forests, DiffPRFs, is proposed. The exponential mechanism is used to select the split point and split attribute while each decision tree is built, and noise is added according to the Laplace mechanism. The differential privacy requirement is satisfied throughout the whole process. Compared with existing algorithms, the proposed method does not require pre-discretization of continuous attributes, which significantly reduces the performance cost of preprocessing large multi-dimensional datasets. Classification is achieved conveniently and efficiently while maintaining high accuracy. Experimental results demonstrate the effectiveness of the algorithm and its superiority over other classification algorithms.

Key words: differential privacy, privacy protection, random forest, data mining
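
The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' DiffPRFs implementation: the abstract does not state the scoring function, how candidate thresholds are enumerated, or how the privacy budget is divided, so the majority-count score (sensitivity 1), the caller-supplied candidate list, and all function names below are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def exponential_mechanism(candidates, scores, epsilon, sensitivity, rng):
    # Sample a candidate with probability proportional to
    # exp(epsilon * score / (2 * sensitivity)).
    top = max(scores)  # shift scores so exp() cannot overflow
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


def split_score(rows, labels, attr, threshold):
    # Assumed utility: majority-class count on each side of the split, summed.
    # Adding or removing one record changes it by at most 1.
    left = Counter(y for x, y in zip(rows, labels) if x[attr] <= threshold)
    right = Counter(y for x, y in zip(rows, labels) if x[attr] > threshold)
    return max(left.values(), default=0) + max(right.values(), default=0)


def choose_split(rows, labels, candidates, epsilon, rng):
    # candidates: hypothetical list of (attribute index, threshold) pairs.
    scores = [split_score(rows, labels, a, t) for a, t in candidates]
    return exponential_mechanism(candidates, scores, epsilon, 1.0, rng)


def noisy_leaf_label(labels, classes, epsilon, rng):
    # Perturb each class count with Laplace(1/epsilon) noise, then take the argmax.
    counts = Counter(labels)
    noisy = {c: counts[c] + laplace_noise(1.0 / epsilon, rng) for c in classes}
    return max(noisy, key=noisy.get)


# Toy data with two continuous attributes, used without discretization.
rng = random.Random(0)
rows = [(1.0, 5.0), (2.0, 4.0), (8.0, 5.5), (9.0, 4.5)]
labels = ["a", "a", "b", "b"]
candidates = [(0, 5.0), (1, 4.75)]
split = choose_split(rows, labels, candidates, epsilon=1.0, rng=rng)
leaf = noisy_leaf_label(labels, ["a", "b"], epsilon=1.0, rng=rng)
```

With a small epsilon the selected split and leaf label are close to random; as epsilon grows, the choice concentrates on the highest-scoring candidate and the true majority class. A full forest would also have to divide the total privacy budget across trees and tree levels, which this sketch leaves out.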
