Journal on Communications ›› 2013, Vol. 34 ›› Issue (3): 14-22. doi: 10.3969/j.issn.1000-436x.2013.03.003

• Academic Paper •

Local density-based similarity matrix construction for spectral clustering

Jian WU¹, Zhi-ming CUI¹, Yu-jie SHI¹, Sheng-li SHENG², Sheng-rong GONG¹

  1. The Institute of Intelligent Information Processing and Application, Soochow University, Suzhou 215006, China
  2. Department of Computer Science, University of Central Arkansas, Conway 72035-0001, USA
  • Online:2013-03-25 Published:2017-07-20
  • Supported by:
    The National Natural Science Foundation of China;The National Natural Science Foundation of China;The National Natural Science Foundation of China

Abstract:

Based on the local and global consistency of the distribution of sample data points, a spectral clustering algorithm that constructs the similarity matrix from local density is proposed. First, the distribution characteristics of the sample data points are analyzed to define local density; the sample points are sorted from dense to sparse by local density, and an undirected graph is built according to a designed connection strategy. Then, drawing on the idea of the GN algorithm, a method for computing the weight matrix from edge betweenness is given, and the similarity matrix for spectral clustering is obtained through data conversion. Finally, the number of clusters is determined from the position of the first eigengap maximum, and the data points in the eigenvector space are clustered with a classical clustering method. Experiments on artificial simulated data sets and UCI data sets show that the proposed spectral clustering algorithm has good clustering capability and robustness.

Key words: spectral clustering, similarity matrix, local density, undirected graph building, edge betweenness
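
Two of the steps named in the abstract, edge betweenness on an undirected graph and choosing the number of clusters from the eigengap, can be sketched in plain Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `edge_betweenness` and `eigengap_cluster_count`, the adjacency-dictionary input format, and the reading of "first eigengap maximum" as the earliest position of the largest gap are all assumptions made here. The paper's local density definition, connection strategy, and data conversion step are not described in the abstract and are not reproduced.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted, undirected graph.

    adj maps each node to the set of its neighbours (assumed input format;
    it must be symmetric and the nodes must be orderable).
    Returns {(u, v): score} with u < v, counting each unordered node pair once.
    """
    score = {(u, v): 0.0 for u in adj for v in adj[u] if u < v}

    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0.0 for v in adj}
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Back-propagate path dependencies from the farthest nodes toward s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[(min(v, w), max(v, w))] += share
                delta[v] += share

    # Each unordered pair was visited from both endpoints, so halve the totals.
    return {edge: total / 2.0 for edge, total in score.items()}


def eigengap_cluster_count(eigenvalues):
    """Pick the cluster count from eigenvalues sorted in ascending order.

    Assumption: the count is the number of eigenvalues that precede the
    largest gap, taking the earliest position if several gaps tie.
    """
    if len(eigenvalues) < 2:
        raise ValueError("need at least two eigenvalues")
    gaps = [b - a for a, b in zip(eigenvalues, eigenvalues[1:])]
    return gaps.index(max(gaps)) + 1


if __name__ == "__main__":
    # Two triangles joined by the single bridge edge (2, 3).
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
             3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
    scores = edge_betweenness(graph)
    print(max(scores, key=scores.get))
    print(eigengap_cluster_count([0.0, 0.02, 0.03, 0.6, 0.7, 0.95]))
```

In the toy graph the bridge between the two triangles carries every shortest path that crosses from one triangle to the other, so it receives the highest score. This is the property the GN algorithm relies on to separate communities. How the paper converts such scores into similarity values is not stated in the abstract.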
