Journal on Communications (通信学报) ›› 2016, Vol. 37 ›› Issue (1): 151-159. doi: 10.11959/j.issn.1000-436x.2016019

• Academic Papers

面向微博的多实体稀疏关系数据联合聚类

于淼,杨武,王巍,申国伟   

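  1. Information Security Research Center, Harbin Engineering University, Harbin 150001, Heilongjiang, China
  • Publication date: 2016-01-25 Release date: 2016-01-27
  • Funding:
    National Natural Science Foundation of China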

Co-clustering of multi-entities sparse relational data in microblogging

Miao YU, Wu YANG, Wei WANG, Guowei SHEN

  1. Information Security Research Center, Harbin Engineering University, Harbin 150001, China
  • Online: 2016-01-25 Published: 2016-01-27
  • Supported by:
    The National High Technology Research and Development Program of China (863 Program)

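Abstract (translated from the Chinese):

To handle the sparse relational data among multiple entity types in large-scale microblogs, an efficient co-clustering algorithm for multi-entity sparse relational data is proposed. To make full use of the multi-relational data, a robust constraint-information embedding method is proposed for constructing the relation matrix; it reduces the sparsity of the matrix and further improves the accuracy of the algorithm. Within a sparsity-constrained block coordinate descent framework, non-negative matrix tri-factorization of the relation matrix yields the cluster indicator matrices of the different entities simultaneously. During the factorization, an efficient projection algorithm provides a fast solution and ensures the sparse structure of the clustering results. Experiments on synthetic and real data sets show that the algorithm improves markedly on 3 metrics, and the improvement is especially pronounced on extremely sparse data.

Key words: microblogging, multi-entity sparse relation, co-clustering, non-negative matrix factorization, auxiliary information embedding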

Abstract:

For large-scale sparse relational data among multiple entities in microblogging, an efficient co-clustering algorithm for multi-entity sparse relational data was proposed. To take full advantage of the multi-relational data, a robust constraint-information embedding algorithm was proposed to construct the relation matrix, and the performance of relation mining was improved by reducing matrix sparsity. Within a sparsity-constrained block coordinate descent framework, the cluster indicator matrices of the different entities were obtained simultaneously from the relation matrix by non-negative matrix tri-factorization. During the factorization, an efficient projection algorithm gave a fast solution while ensuring the sparse structure of the clustering result. Experiments on synthetic and real data sets show that the proposed algorithm outperforms all the baselines on three indicators. The improvement is especially significant on extremely sparse data.

Key words: microblogging, multi-entity sparse relation, co-clustering, non-negative matrix factorization, auxiliary information embedding
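
The tri-factorization step named in the abstract can be illustrated with a minimal sketch. It approximates a non-negative relation matrix R (n x m) as F S G^T by reducing the squared error ||R - F S G^T||^2, updating F, S, and G one block at a time. This is not the authors' solver: it swaps in plain multiplicative updates for their projection-based block coordinate descent, and it leaves out the constraint-information embedding entirely. The names nmtf, project_top_k, and sq_error, the toy matrix, and all parameter values are assumptions made for illustration.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def mult_update(X, num, den, eps=1e-9):
    """Element-wise X * num / (den + eps); entries stay non-negative."""
    return [[x * n / (d + eps) for x, n, d in zip(xr, nr, dr)]
            for xr, nr, dr in zip(X, num, den)]

def sq_error(R, F, S, G):
    """Squared Frobenius error between R and F S G^T."""
    approx = matmul(matmul(F, S), transpose(G))
    return sum((r - a) ** 2 for rr, ar in zip(R, approx) for r, a in zip(rr, ar))

def nmtf(R, k, l, iters=100, seed=0):
    """Hypothetical helper: factor R (n x m) into F (n x k), S (k x l), G (m x l)."""
    rng = random.Random(seed)

    def rand(rows, cols):
        return [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rows)]

    n, m = len(R), len(R[0])
    F, S, G = rand(n, k), rand(k, l), rand(m, l)
    Rt = transpose(R)
    for _ in range(iters):
        # Block 1: row-entity factor F, with S and G held fixed.
        SGt = matmul(S, transpose(G))                      # k x m
        F = mult_update(F, matmul(R, transpose(SGt)),
                        matmul(F, matmul(SGt, transpose(SGt))))
        # Block 2: column-entity factor G, with F and S held fixed.
        FS = matmul(F, S)                                  # n x l
        G = mult_update(G, matmul(Rt, FS),
                        matmul(G, matmul(transpose(FS), FS)))
        # Block 3: association matrix S, with F and G held fixed.
        Ft, Gt = transpose(F), transpose(G)
        S = mult_update(S, matmul(matmul(Ft, R), G),
                        matmul(matmul(matmul(Ft, F), S), matmul(Gt, G)))
    return F, S, G

def project_top_k(row, k):
    """Euclidean projection onto non-negative vectors with at most k non-zeros."""
    clipped = [max(x, 0.0) for x in row]
    order = sorted(range(len(row)), key=lambda i: clipped[i], reverse=True)
    keep = set(order[:k])
    return [x if i in keep else 0.0 for i, x in enumerate(clipped)]

if __name__ == "__main__":
    # Toy relation matrix with two row groups and two column groups.
    R = [[1, 1, 1, 0, 0, 0]] * 3 + [[0, 0, 0, 1, 1, 1]] * 3
    F, S, G = nmtf(R, k=2, l=2)
    # Keeping one non-zero per row turns F into a hard cluster indicator.
    for row in F:
        print(project_top_k(row, 1))
```

Here the projection is applied only once, as a post-processing step on the rows of F; according to the abstract, the paper applies its projection inside the factorization itself, which this sketch does not attempt to reproduce.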
