Journal on Communications, 2013, Vol. 34, Issue (Z2): 74-78. doi: 10.3969/j.issn.1000-436x.2013.Z2.015

• New-Generation Campus Network •

Microblog news topic detection method based on SVM filtering

Jun-xia CHENG1, Zhi-tang LI1,2, Ming-guang ZOU1, Jin XIAO1

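  1. 1 School of Computer Science, Huazhong University of Science and Technology, Wuhan, Hubei 430074
    2 National Engineering Laboratory for Next Generation Internet Access System, Wuhan, Hubei 430074
  • Publication date: 2013-12-25 Release date: 2017-06-16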

Novel topic detection method for microblog based on SVM filtration

Jun-xia CHENG1, Zhi-tang LI1,2, Ming-guang ZOU1, Jin XIAO1

  1. 1 School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
    2 National Engineering Laboratory for Next Generation Internet Access System, Wuhan 430074, China
  • Online: 2013-12-25 Published: 2017-06-16

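Abstract (translated from the Chinese):

Building on clustering-based topic detection, a detection method based on SVM filtering is proposed. Before clustering, the method abstracts microblog text features into vectors that serve as input to a support vector machine and filters the microblog texts, which reduces the amount of computation. To address the long-tail phenomenon in microblog clustering, an improved single-pass clustering method based on high-frequency word sorting is also proposed, which detects the presence of isolated points well. Experiments show that the method can effectively detect news topics in massive microblog data.

Keywords: topic detection, feature vector, SVM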

Abstract:

A detection method based on SVM filtering is proposed. The method uses text features as input vectors to filter microblog texts, greatly reducing the amount of computation. An improved single-pass clustering algorithm based on high-frequency word sorting is also proposed, which detects isolated points well. Experimental results show that the method can effectively detect news topics in massive microblog data.

Key words: topic detection, feature vector, SVM
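
The abstract does not say how the improved single-pass clustering works, so the following is only a minimal sketch of generic single-pass clustering, not the authors' algorithm. It assumes that "high-frequency word sorting" means reducing each post to its k words that are most frequent across the corpus, that similarity is Jaccard overlap, and that a cluster with a single member counts as an isolated point. The names `top_words`, `single_pass`, `isolated_points`, the threshold 0.3, and k = 5 are all hypothetical. The SVM filtering step is not shown; the input is assumed to be already filtered and tokenized.

```python
from collections import Counter


def top_words(tokens, corpus_freq, k=5):
    """Keep the k distinct tokens of a post that are most frequent in the corpus.

    Assumption: this is one possible reading of "high-frequency word sorting".
    """
    ranked = sorted(set(tokens), key=lambda w: (-corpus_freq[w], w))
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard similarity between two word sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def single_pass(posts, threshold=0.3, k=5):
    """Scan the posts once; each post joins its most similar cluster
    if the similarity reaches the threshold, otherwise it opens a new cluster."""
    # Document frequency: in how many posts each word appears.
    corpus_freq = Counter(w for tokens in posts for w in set(tokens))
    clusters = []
    for idx, tokens in enumerate(posts):
        key = top_words(tokens, corpus_freq, k)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = jaccard(key, cluster["words"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(idx)
            best["words"] |= key  # grow the cluster's word set
        else:
            clusters.append({"words": set(key), "members": [idx]})
    return clusters


def isolated_points(clusters):
    """Indices of posts that ended up alone in their cluster."""
    return [c["members"][0] for c in clusters if len(c["members"]) == 1]


if __name__ == "__main__":
    # Toy, already-tokenized posts (made up for illustration).
    posts = [
        ["earthquake", "sichuan", "rescue"],
        ["earthquake", "sichuan", "relief"],
        ["lunch", "noodles"],
        ["earthquake", "rescue", "team"],
    ]
    clusters = single_pass(posts)
    print([c["members"] for c in clusters])
    print(isolated_points(clusters))
```

On the toy input, the three earthquake posts fall into one cluster and the unrelated post stays alone. Real data would likely need a more robust cluster representation than a growing word set.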
