Journal on Communications, 2017, Vol. 38, Issue (3): 101-111. doi: 10.11959/j.issn.1000-436x.2017061

• Papers

Uncertain data analysis algorithm based on fast Gaussian transform

Rong-hua CHI1, Yuan CHENG2,3, Su-xia ZHU2, Shao-bin HUANG1, De-yun CHEN2,3

    1 College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
    2 College of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
    3 Postdoctoral Research Station, College of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
  • Revised: 2016-09-15 Online: 2017-03-01 Published: 2017-04-13
  • Supported by:
    The National Natural Science Foundation of China for Youths (61502123); Heilongjiang Province Science Foundation for Youths (QC2015084); Heilongjiang Province Support Program for Youth Academic Backbones in Regular Institutions of Higher Education (1253G017)

Abstract:

Uncertainty information should be fully exploited when clustering uncertain data. In current research, the way uncertainty is handled during the construction of uncertain data models and during distance measurement degrades both the accuracy of the clustering results and the clustering efficiency. To solve these problems, an uncertain data clustering algorithm based on the fast Gaussian transform was proposed. First, the data model was constructed according to the characteristics of the uncertainty distribution, without assuming a particular data distribution in advance. Next, the similarity between uncertain data objects was measured by combining two important features of uncertain objects: their attribute features and the probability density function that characterizes their uncertainty distribution. On this basis, the uncertain data clustering algorithm was proposed. Finally, experimental results on UCI and real datasets indicate that the proposed algorithm achieves better efficiency and accuracy.
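
The abstract does not give the algorithm's formulas, so the following Python sketch is only an illustration of the ingredients it names, not the authors' method. It assumes each uncertain object is represented by a set of samples, estimates its probability density function with Gaussian kernel density estimation, and combines an attribute term with a density term into one similarity score. The function names, the `bandwidth` and `alpha` parameters, and the specific similarity formula are all hypothetical. The sums are evaluated directly; they are discrete Gauss transforms, which is the kind of sum a fast Gaussian transform is designed to approximate more cheaply.

```python
import numpy as np


def gauss_transform(sources, targets, weights, h):
    """Direct discrete Gauss transform (no fast approximation is used here).

    Returns G(t_j) = sum_i w_i * exp(-||t_j - s_i||^2 / h^2) for every target.
    """
    diff = targets[:, None, :] - sources[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    return np.exp(-sq_dist / h ** 2) @ weights


def kde(samples, queries, bandwidth):
    """Gaussian kernel density estimate of one object's pdf at the query points."""
    n, d = samples.shape
    norm = n * (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    # exp(-r^2 / (2 * bw^2)) is a Gauss transform with h = sqrt(2) * bw.
    h = np.sqrt(2.0) * bandwidth
    return gauss_transform(samples, queries, np.ones(n), h) / norm


def pdf_overlap(a, b, bandwidth):
    """Closed-form integral of the product of the two objects' Gaussian KDEs."""
    n, d = a.shape
    m = b.shape[0]
    # The product of two Gaussians of variance bw^2 integrates to a Gaussian
    # of variance 2 * bw^2 in the sample difference, i.e. h = 2 * bw.
    total = gauss_transform(a, b, np.ones(n), 2.0 * bandwidth).sum()
    return total / (n * m * (4.0 * np.pi * bandwidth ** 2) ** (d / 2.0))


def similarity(a, b, bandwidth=0.5, alpha=0.5):
    """Hypothetical similarity: weighted mix of an attribute term and a pdf term."""
    # Attribute term (assumption): decays with the distance between sample means.
    attribute = np.exp(-np.linalg.norm(a.mean(axis=0) - b.mean(axis=0)))
    # Density term (assumption): overlap normalized so identical objects score 1.
    density = pdf_overlap(a, b, bandwidth) / np.sqrt(
        pdf_overlap(a, a, bandwidth) * pdf_overlap(b, b, bandwidth)
    )
    return alpha * attribute + (1.0 - alpha) * density
```

A pairwise matrix of such scores could then be passed to any similarity-based clustering procedure; which procedure the paper uses is not stated in the abstract.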

Key words: clustering analysis, uncertain data, probability density function, fast Gaussian transform, kernel density estimation

