Telecommunications Science ›› 2018, Vol. 34 ›› Issue (4): 156-161. doi: 10.11959/j.issn.1000-0801.2018133

• Power informatization column •

Implementation of a big data anonymization system based on Spark

Chaoyi BIAN 1,2, Shaomin ZHU 1, Tao ZHOU 1

  1. Beijing Venus Information Security Technology Incorporated Company, Beijing 100193, China
  2. Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Revised: 2018-02-10 Online: 2018-04-01 Published: 2018-05-02


Group-based anonymization is a classical data anonymization framework. It protects privacy by constructing groups of anonymized data records such that records in the same group cannot be distinguished from one another. Big data analysis in the electric power industry involves both the core data of power enterprises and users' private data, so the data are highly sensitive, and traditional data anonymization systems cannot meet the industry's needs for big data business applications and safety protection. A new big data anonymization system based on Spark was designed and implemented. It supports multiple data formats stored on Hadoop and substantially improves processing efficiency for big data.
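
To make the group-based idea concrete, the following is a minimal single-machine sketch in plain Python. It is not the system described in the abstract, which does not specify its algorithm: the field names (age, postcode, usage_kwh), the generalization rules, and the helper functions generalize and anonymize are all hypothetical. Records are coarsened on their quasi-identifiers, grouped, and any group with fewer than k members is suppressed, so that every released record shares its quasi-identifier values with at least k - 1 others.

```python
from collections import defaultdict


def generalize(record):
    """Coarsen quasi-identifiers (hypothetical rules):
    age -> 10-year band, postcode -> 3-digit prefix."""
    low = (record["age"] // 10) * 10
    return (f"{low}-{low + 9}", record["postcode"][:3] + "***")


def anonymize(records, k):
    """Return generalized records, dropping groups with fewer than k members."""
    groups = defaultdict(list)
    for r in records:
        groups[generalize(r)].append(r)

    released = []
    for (age_band, postcode), members in groups.items():
        if len(members) < k:
            continue  # suppress groups too small to hide an individual
        for r in members:
            released.append(
                {"age": age_band, "postcode": postcode, "usage_kwh": r["usage_kwh"]}
            )
    return released


# Made-up sample data, for illustration only.
records = [
    {"age": 34, "postcode": "100193", "usage_kwh": 210},
    {"age": 37, "postcode": "100876", "usage_kwh": 180},
    {"age": 31, "postcode": "100101", "usage_kwh": 305},
    {"age": 52, "postcode": "200030", "usage_kwh": 150},
]
released = anonymize(records, k=2)
```

With k=2, the three records in their thirties fall into one group and are released with identical quasi-identifier values, while the single record in the 50-59 band is suppressed. On a cluster, the grouping step would presumably become a distributed aggregation keyed on the generalized quasi-identifiers, but that mapping is an assumption rather than something the abstract states.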

Key words: data anonymization, privacy, electric power industry, safety protection, Spark

