Journal on Communications (通信学报)
• Academic paper •
程俊霞1,李芝棠1,2,邹明光1,肖津1
Abstract: Building on clustering-based topic detection, a detection method based on SVM filtering is proposed. Before clustering, the method abstracts microblog text features into vectors that are fed to a support vector machine, which filters the microblog posts and thereby reduces the amount of computation. To address the long-tail phenomenon in microblog clustering, an improved single-pass clustering method based on high-frequency word sorting is also proposed, which detects isolated points well. Experimental results show that the method effectively detects news topics in massive microblog data.
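The filter-then-cluster pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The `keep` predicate is a hypothetical stand-in for the trained SVM filter. Reading "high-frequency word sorting" as "visit posts containing more of the corpus-wide top words first" is an assumption, as are the cosine similarity measure, the `threshold` and `top_k` parameters, and the summed-count centroid.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(posts, keep, threshold=0.5, top_k=3):
    """Filter posts with `keep`, then cluster them in a single pass.

    posts: list of token lists.
    keep: predicate standing in for the SVM filter (hypothetical).
    Returns a list of clusters, each a list of indices into `posts`.
    """
    kept = [i for i, tokens in enumerate(posts) if keep(tokens)]

    # Assumption: posts containing more of the top_k most frequent
    # words are visited first, so large topics seed clusters early.
    freq = Counter(t for i in kept for t in posts[i])
    hot = {t for t, _ in freq.most_common(top_k)}
    kept.sort(key=lambda i: -sum(t in hot for t in posts[i]))  # stable sort

    centroids, clusters = [], []
    for i in kept:
        vec = Counter(posts[i])
        sims = [cosine(vec, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            clusters[best].append(i)
            centroids[best].update(vec)  # centroid kept as summed term counts
        else:
            centroids.append(vec)  # no close cluster: start a new one
            clusters.append([i])
    return clusters
```

Posts that match no existing cluster end up as singleton clusters, which is one simple way to surface the isolated points mentioned in the abstract. In practice, `keep` could wrap the prediction of any trained binary classifier.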
程俊霞, 李芝棠, 邹明光, 肖津. Microblog news topic detection method based on SVM filtering (基于SVM过滤的微博新闻话题检测方法) [J]. Journal on Communications (通信学报).
Link to this article: https://www.infocomm-journal.com/txxb/CN/
https://www.infocomm-journal.com/txxb/CN/Y2013/V34/IZ2/15