Chinese Journal of Network and Information Security ›› 2016, Vol. 2 ›› Issue (11): 47-51. doi: 10.11959/j.issn.2096-109x.2016.00087

• Papers

Clustering algorithm preserving differential privacy in the framework of Spark

Zhi-qiang GAO, Qing-peng LI, Ren-yuan HU

  1. Department of Information Engineering, University of PAP, Xi’an 710086, China
  • Revised: 2016-06-26 Online: 2016-11-01 Published: 2016-11-15
  • Supported by:
    The National Natural Science Foundation of China (61309008); The Natural Science Foundation of Shaanxi Province (2014JQ8049)

Abstract:

To address the failure of traditional methods to withstand malicious attacks backed by arbitrary background knowledge during clustering analysis of massive data, an improved clustering algorithm designed to preserve differential privacy is proposed under the Spark framework. The algorithm is theoretically proved to satisfy ε-differential privacy on the Spark platform. Experimental results show that, while maintaining the availability of the clustering results, the improved algorithm offers advantages in privacy protection and performs satisfactorily in running time and efficiency. The proposed algorithm therefore shows good application prospects for privacy-preserving data clustering analysis and data security.
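
The abstract does not reproduce the algorithm itself, so the following is only a minimal sketch of one common way to make a k-means update ε-differentially private: Laplace noise is added to each cluster's point count and coordinate sums before the centroids are recomputed. It is not the authors' method. The function names (`laplace`, `dp_kmeans_step`), the assumption that every point lies in [0, 1]^d, and the single-machine loop (standing in for what would be a distributed aggregation on Spark) are all illustrative assumptions.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_kmeans_step(points, centroids, epsilon, rng):
    """One noisy k-means update (hypothetical helper, points in [0, 1]^d)."""
    k, d = len(centroids), len(centroids[0])
    sums = [[0.0] * d for _ in range(k)]
    counts = [0.0] * k

    # Assign each point to its nearest centroid and accumulate statistics.
    for p in points:
        j = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
        )
        counts[j] += 1
        for i in range(d):
            sums[j][i] += p[i]

    # Adding or removing one point changes one count by 1 and one sum by
    # at most 1 per coordinate, so the L1 sensitivity is d + 1.
    scale = (d + 1) / epsilon

    updated = []
    for j in range(k):
        noisy_count = counts[j] + laplace(scale, rng)
        noisy_sum = [s + laplace(scale, rng) for s in sums[j]]
        if noisy_count < 1.0:
            # Too little signal: keep the previous centroid.
            updated.append(list(centroids[j]))
        else:
            # Clamping to [0, 1] is post-processing and costs no privacy.
            updated.append(
                [min(1.0, max(0.0, s / noisy_count)) for s in noisy_sum]
            )
    return updated
```

Each call spends a budget of ε, so running T iterations this way consumes Tε in total under basic sequential composition; a smaller ε gives stronger privacy at the cost of noisier centroids.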

Key words: Spark, differential privacy, clustering algorithm, data mining, big data analysis

