Journal on Communications, 2021, Vol. 42, Issue (9): 133-143. doi: 10.11959/j.issn.1000-436x.2021152

• Papers

ERDOF: outlier detection algorithm based on entropy weight distance and relative density outlier factor

Zhongping ZHANG1,2,3, Weixiong LIU1, Yuting ZHANG1, Yu DENG1, Mianxin WEI1   

  1 College of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
  2 The Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Qinhuangdao 066004, China
  3 The Key Laboratory of Software Engineering of Hebei Province, Qinhuangdao 066004, China
  • Revised: 2021-06-30 Online: 2021-09-25 Published: 2021-09-01
  • Supported by:
    Hebei Province Innovation Capability Improvement Plan Project (20557640D)

Abstract:

An outlier detection algorithm based on entropy weight distance and a relative density outlier factor was proposed to address low detection accuracy on data sets with complex distributions and high dimensionality. Firstly, entropy weight distance was introduced in place of Euclidean distance to improve the detection accuracy of outliers. Then, Gaussian kernel density estimation was carried out for each data object based on the concept of natural neighbors. At the same time, relative distance was introduced to describe how far a data object deviates from its neighborhood, improving the ability of the algorithm to detect outliers in low-density regions. Finally, an outlier factor based on entropy weight distance and relative density was defined to quantify the degree to which a data object is an outlier. Experiments on artificial and real data sets show that the proposed algorithm adapts effectively to various data distributions and to outlier detection in high-dimensional data.
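
The first two steps of the abstract can be illustrated with a minimal sketch. It is not the ERDOF implementation: the abstract does not give the paper's formulas, so the sketch assumes the standard entropy weight method (min-max scaling, column proportions, Shannon entropy) and a weighted Euclidean form for the distance. It also replaces the natural-neighbor search with a fixed number of nearest neighbors and omits the Gaussian normalising constant. The function names and the parameters `k` and `h` are hypothetical.

```python
import numpy as np

def entropy_weights(X):
    """Attribute weights from the standard entropy weight method (assumed form)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Min-max scale each attribute; constant columns are mapped to zeros.
    span = X.max(axis=0) - X.min(axis=0)
    scaled = (X - X.min(axis=0)) / np.where(span > 0, span, 1.0)
    # Column proportions p_ij; the small epsilon keeps log() finite.
    eps = 1e-12
    P = (scaled + eps) / (scaled + eps).sum(axis=0)
    # Shannon entropy per attribute, normalised by ln(n).
    E = -(P * np.log(P)).sum(axis=0) / np.log(n)
    # Attributes with lower entropy receive larger weights.
    div = np.clip(1.0 - E, 0.0, None)
    if div.sum() == 0:
        return np.full(X.shape[1], 1.0 / X.shape[1])
    return div / div.sum()

def entropy_weight_distance(x, y, w):
    """Weighted Euclidean distance using entropy weights (assumed form)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(np.sum(np.asarray(w, dtype=float) * diff ** 2)))

def knn_gaussian_density(X, w, k=3, h=1.0):
    """Unnormalised Gaussian kernel density over the k nearest neighbours.

    A fixed k stands in for the natural-neighbour search used in the paper;
    h is a hypothetical bandwidth.
    """
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((np.asarray(w, dtype=float) * diff ** 2).sum(axis=2))
    np.fill_diagonal(D, np.inf)  # exclude each point from its own neighbourhood
    nearest = np.sort(D, axis=1)[:, :k]
    return np.exp(-nearest ** 2 / (2.0 * h ** 2)).mean(axis=1)
```

Under these assumptions, an object far from every cluster receives a much lower density than its neighbors, which is the signal a relative density outlier factor would then compare across neighborhoods.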

Key words: data mining, outlier detection, information entropy, kernel density estimation
