Chinese Journal of Network and Information Security, 2016, Vol. 2, Issue (5): 21-29. doi: 10.11959/j.issn.2096-109x.2016.00048

• Papers

Time series and semantics-based Chinese microblog topic detection and tracking method

Tie-ming CHEN, Xiao-hao WANG, Wei-wei PANG, Jie JIANG

  1. College of Computer Science & Technology, Zhejiang University of Technology, Hangzhou 310023, China
  • Revised: 2016-04-27 Online: 2016-05-15 Published: 2020-03-26
  • Supported by:
    The National Natural Science Foundation of China (U1509214); The Natural Science Foundation of Zhejiang Province (LY16F020035)

Abstract:

Microblogs, a widely used social networking tool, are characterized by short documents, rapid propagation, and fast-changing topics, which makes social topic detection and tracking highly challenging. A new systematic framework for microblog topic detection and tracking was proposed, based on microblog clustering that uses temporal trends and semantic similarity. First, a feature word selection method for hot topics was presented by defining the temporal frequent word set. Second, an initial clustering was performed on the selected temporal frequent word set. Because the initial clusters may overlap, an effective overlap elimination algorithm was proposed that introduces an extended short-document semantic membership to separate any overlapping initial clusters. Finally, an aggregated topic clustering method was applied using the cluster semantic similarity matrix. Experiments on a real-world dataset from Sina microblog show that the method achieves excellent performance for Chinese microblog topic detection and tracking.
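
The four stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the definitions of the temporal frequent word set, the semantic membership, or the cluster similarity matrix, so the sketch substitutes simple stand-ins (document-frequency growth between two time windows, a token-frequency membership score, and Jaccard similarity over cluster vocabularies). All function names, parameters, and thresholds are hypothetical, and the posts are assumed to be already segmented into word lists.

```python
from collections import Counter

def temporal_frequent_words(prev_docs, curr_docs, min_support=2, min_growth=2.0):
    """Assumed stand-in: words frequent in the current window whose
    document frequency grew relative to the previous window."""
    prev_df = Counter(w for d in prev_docs for w in set(d))
    curr_df = Counter(w for d in curr_docs for w in set(d))
    return {w for w, c in curr_df.items()
            if c >= min_support and c >= min_growth * max(prev_df[w], 1)}

def initial_clusters(docs, words):
    """One (possibly overlapping) cluster of document indices per word."""
    return {w: {i for i, d in enumerate(docs) if w in d} for w in words}

def eliminate_overlap(docs, clusters):
    """Keep each document only in the cluster with the highest membership.
    Membership here (an assumption) is the summed cluster frequency of the
    document's tokens divided by the cluster size."""
    vocab = {w: Counter(t for i in idx for t in docs[i])
             for w, idx in clusters.items()}
    result = {w: set() for w in clusters}
    for i, d in enumerate(docs):
        candidates = sorted(w for w, idx in clusters.items() if i in idx)
        if not candidates:
            continue  # document contains no temporal frequent word
        best = max(candidates,
                   key=lambda w: sum(vocab[w][t] for t in d) / len(clusters[w]))
        result[best].add(i)
    return {w: idx for w, idx in result.items() if idx}

def merge_clusters(docs, clusters, threshold=0.3):
    """Greedily merge clusters whose vocabularies have Jaccard similarity
    at or above the threshold (a stand-in for semantic similarity)."""
    groups = [set(idx) for _, idx in sorted(clusters.items())]

    def tokens(group):
        return {t for i in group for t in docs[i]}

    merged = True
    while merged:
        merged = False
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                ta, tb = tokens(groups[a]), tokens(groups[b])
                union = ta | tb
                if union and len(ta & tb) / len(union) >= threshold:
                    other = groups.pop(b)
                    groups[a] |= other
                    merged = True
                    break
            if merged:
                break
    return groups

def detect_topics(prev_docs, curr_docs, threshold=0.3):
    """Run the four stages in order and return clusters of document indices."""
    words = temporal_frequent_words(prev_docs, curr_docs)
    clusters = eliminate_overlap(curr_docs, initial_clusters(curr_docs, words))
    return merge_clusters(curr_docs, clusters, threshold)

# Toy, made-up data: each post is a list of already-segmented words.
prev_docs = [["weather", "rain"], ["lunch", "noodles"]]
curr_docs = [
    ["earthquake", "sichuan", "rescue"],
    ["earthquake", "rescue", "team"],
    ["sichuan", "earthquake", "news"],
    ["concert", "tickets", "beijing"],
    ["concert", "beijing", "tonight"],
    ["weather", "rain"],
]
topics = detect_topics(prev_docs, curr_docs)
```

On this toy input the posts about the earthquake and the posts about the concert end up in two separate groups, while the weather post, which contains no bursty word, is left unassigned. A real system would replace the stand-in scores with the semantic measures defined in the paper.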

Key words: microblog text, frequent words, feature selection, clustering, topic detection, time series, semantics
