Journal on Communications ›› 2022, Vol. 43 ›› Issue (10): 186-195. doi: 10.11959/j.issn.1000-436x.2022193

• Papers

Outlier detection algorithm based on fast density peak clustering outlier factor

Zhongping ZHANG(1,2,3), Sen LI(1), Weixiong LIU(1), Shuxia LIU(4)

  1. College of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
  2. The Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Qinhuangdao 066004, China
  3. The Key Laboratory of Software Engineering of Hebei Province, Qinhuangdao 066004, China
  4. Hebei Normal University of Science and Technology, Qinhuangdao 066004, China
  • Revised: 2022-07-08; Online: 2022-10-25; Published: 2022-10-01
  • Supported by:
    The National Natural Science Foundation of China (61972334); The National Social Science Foundation of China (20BJ122); Hebei Province Innovation Capability Improvement Plan Project (20557640D); The Intelligent Image Workpiece Recognition of Sida Railway (x2021134)

Abstract:

To address the density peak clustering algorithm's reliance on manually set parameters and its high time complexity, an outlier detection algorithm based on a fast density peak clustering outlier factor is proposed. First, the density estimate used in density peak clustering is replaced by a k-nearest-neighbor estimate, with a KD-Tree index used to compute the k nearest neighbors of each data object; cluster centers are then selected automatically from the product of density and distance. In addition, the centripetal relative distance and the fast density peak clustering outlier factor are defined to describe the degree to which a data object is an outlier. Experiments on artificial and real data sets compare the proposed algorithm with several classical and recent algorithms and verify its effectiveness and time efficiency.
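
The abstract gives no formulas, so the following Python sketch only illustrates the general idea of ranking cluster-center candidates by the product of density and distance. It is not the authors' implementation. Several things here are assumptions: the density is taken as k divided by the summed distance to the k nearest neighbors, the distance is the usual density peak clustering distance to the nearest object of higher density, and a brute-force neighbor search stands in for the KD-Tree index to keep the example dependency-free. The function names and the toy data are hypothetical, and the centripetal relative distance and outlier factor are not sketched because the abstract does not define them.

```python
import math


def knn_density(points, k, eps=1e-12):
    """Assumed density: k divided by the summed distance to the k nearest neighbors.

    Brute-force search is used here; the paper uses a KD-Tree index instead.
    """
    rho = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        rho.append(k / (sum(dists[:k]) + eps))  # eps guards against duplicate points
    return rho


def delta_distance(points, rho):
    """Distance from each object to its nearest object of strictly higher density."""
    n = len(points)
    delta = []
    for i in range(n):
        higher = [math.dist(points[i], points[j]) for j in range(n) if rho[j] > rho[i]]
        if higher:
            delta.append(min(higher))
        else:
            # No denser object exists: fall back to the largest distance from this object.
            delta.append(max(math.dist(points[i], q) for q in points))
    return delta


def decision_values(points, k):
    """Return density, distance, and their product for every object."""
    rho = knn_density(points, k)
    delta = delta_distance(points, rho)
    gamma = [r * d for r, d in zip(rho, delta)]
    return rho, delta, gamma


# Hypothetical toy data: two tight groups of five objects and one isolated object.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
    (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
    (5, 30),
]

rho, delta, gamma = decision_values(points, k=3)

# Objects with the largest density-distance product are the center candidates.
ranked = sorted(range(len(points)), key=lambda i: gamma[i], reverse=True)
print("center candidates:", sorted(ranked[:2]))
print("lowest-density object:", min(range(len(points)), key=lambda i: rho[i]))
```

In this toy data the two group midpoints receive the largest products, while the isolated object receives the lowest density. How the paper turns the product into an automatic cutoff is not stated in the abstract, so the sketch stops at the ranking.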

Key words: data mining, density peak clustering, outlier, k nearest neighbor, centripetal relative distance

