Journal on Communications (通信学报) ›› 2018, Vol. 39 ›› Issue (4): 189-198. doi: 10.11959/j.issn.1000-436x.2018056

• Academic Communication •

Social network bursty topic discovery based on RNN and topic model

Lei SHI, Junping DU, Meiyu LIANG

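  1. Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Online: 2018-04-01 Published: 2018-04-29
  • About the authors: Lei SHI (born 1986), male, from Tuquan, Inner Mongolia, is a Ph.D. candidate at Beijing University of Posts and Telecommunications; his main research interests are artificial intelligence, data mining, and social network search. | Junping DU (born 1963), female, from Beijing, Ph.D., is a professor and doctoral supervisor at Beijing University of Posts and Telecommunications; her main research interests are artificial intelligence and data mining. | Meiyu LIANG (born 1985), female, from Tai'an, Shandong, is an associate professor and master's supervisor at Beijing University of Posts and Telecommunications; her main research interests are information search, data mining, intelligent information processing, and computer vision.
  • Supported by:
    The National Natural Science Foundation of China (61320106006); The National Natural Science Foundation of China (61532006); The National Natural Science Foundation of China (61772083)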

Social network bursty topic discovery based on RNN and topic model

Lei SHI, Junping DU, Meiyu LIANG

  1. Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Online: 2018-04-01 Published: 2018-04-29
  • Supported by:
    The National Natural Science Foundation of China (61320106006); The National Natural Science Foundation of China (61532006); The National Natural Science Foundation of China (61772083)

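Abstract:

Social network data are sparse and noisy and are accompanied by a large number of meaningless topics. Traditional bursty topic discovery methods cannot solve the sparsity problem of short texts in social networks, and they require complicated post-processing. To address these problems, a bursty topic discovery method based on recurrent neural network (RNN) and topic model (RTM-SBTD) is proposed. First, RNN and inverse document frequency (IDF) are combined to construct a weight prior for learning the relationships between words, and word pairs are constructed to solve the short-text sparsity problem. Second, a “spike and slab” prior is introduced into the model to decouple the sparsity and smoothness of the bursty topic distribution. Finally, the burstiness of words is introduced to model common topics and bursty topics separately, so that bursty topics are discovered automatically. Experimental results show that, compared with existing mainstream bursty topic discovery methods, the proposed RTM-SBTD method outperforms the compared algorithms on multiple evaluation metrics.

Key words: social network, bursty topic discovery, topic model, recurrent neural network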

Abstract:

Social network data are noisy and diverse, and contain a large number of meaningless topics. Traditional bursty topic discovery methods cannot solve the sparseness problem of short texts in social networks, and they require complicated post-processing. To tackle these problems, a bursty topic discovery method based on recurrent neural network (RNN) and topic model (RTM-SBTD) was proposed. Firstly, a weight prior based on RNN and inverse document frequency (IDF) was constructed to learn the relationships between words. At the same time, word pairs were constructed to solve the sparseness problem. Secondly, the “spike and slab” prior was introduced to decouple the sparsity and smoothness of the bursty topic distribution. Finally, the burstiness of words was leveraged to model bursty topics and common topics separately and to discover the bursty topics automatically. Various experiments were conducted to evaluate the effectiveness of the proposed method. Both qualitative and quantitative evaluations demonstrate that the proposed RTM-SBTD method outperforms several state-of-the-art methods.
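As an illustration of two ingredients named in the abstract, the following minimal Python sketch computes IDF values and builds word pairs from tokenized short texts. It is not the RTM-SBTD implementation: the function names (`idf_weights`, `word_pairs`, `pair_counts`), the plain `log(N / df)` form of IDF, and the toy corpus are all assumptions made for this example, and the RNN component of the weight prior is omitted entirely.

```python
import math
from collections import Counter
from itertools import combinations


def idf_weights(docs):
    """Return {word: log(N / df)} for a list of tokenized documents.

    Assumption: the unsmoothed textbook IDF; the paper's exact form may differ.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each word once per document
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}


def word_pairs(doc):
    """Return the unordered pairs of distinct words that co-occur in one short text."""
    return list(combinations(sorted(set(doc)), 2))


def pair_counts(docs):
    """Count in how many documents each word pair occurs (hypothetical helper)."""
    counts = Counter()
    for doc in docs:
        counts.update(word_pairs(doc))
    return counts


# Toy corpus of three tokenized short posts (invented for illustration).
docs = [
    ["earthquake", "hits", "city"],
    ["city", "traffic", "jam"],
    ["earthquake", "relief", "city"],
]
weights = idf_weights(docs)
pairs = pair_counts(docs)
```

Pooling pairs over the whole corpus gives a model co-occurrence statistics that a single short post is too sparse to supply, which is the motivation the abstract gives for constructing word pairs. A word that occurs in every post, such as "city" in the toy corpus, receives an IDF of zero and therefore contributes little to any weight derived from it.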

Key words: social network, bursty topic discovery, topic model, RNN
