Telecommunications Science (电信科学) ›› 2020, Vol. 36 ›› Issue (11): 47-60. doi: 10.11959/j.issn.1000-0801.2020302

• Research and Development •

Biterm topic model of social network users' sentiment by integrating word co-occurrence

Qiuyang GU (顾秋阳)1,2, Bao WU (吴宝)1,2, Chunhua JU (琚春华)3

  1 School of Management, Zhejiang University of Technology, Hangzhou 310023, China
  2 China Institute for Small and Medium Enterprises, Zhejiang University of Technology, Hangzhou 310023, China
  3 School of Management Science & Engineering, Zhejiang Gongshang University, Hangzhou 310018, China
  • Revised: 2020-11-10 Online: 2020-11-20 Published: 2020-12-09
  • About the authors: Qiuyang GU (1995- ), male, is a doctoral student at Zhejiang University of Technology; his main research interests include intelligent information processing, data mining, and e-commerce and logistics optimization | Bao WU (1979- ), male, PhD, is a professor and doctoral supervisor at Zhejiang University of Technology; his main research interests include social networks, corporate social responsibility and high-quality development | Chunhua JU (1962- ), male, PhD, is a professor and doctoral supervisor at Zhejiang Gongshang University; his main research interests include intelligent information processing, data mining, and e-commerce and logistics optimization

Abstract:

As the number of social network users has grown in recent years, text-based user sentiment analysis has attracted wide attention and application. However, problems such as data sparsity often reduce the accuracy and speed of sentiment recognition methods. A user sentiment Biterm topic model (US-BTM) is proposed, which discovers user preferences and sentiment tendencies from texts about specific places and uses biterms for topic modeling. A user aggregation strategy is used to form pseudo-documents, and word pairs are created over the whole corpus to address data sparsity and the short-text problem. Topics are then learned through a word co-occurrence model, so that they are inferred from corpus-level information. By analyzing the set of word pairs in the review corpus for a specific scene, together with the sentiment of the corresponding topics, the model aims to accurately predict users' interests, preferences and sentiment toward that scene. The experimental results show that the proposed method accurately captures users' sentiment tendencies and correctly reveals their preferences. The method can be widely applied to social network content description and recommendation, user interest description, semantic analysis and other fields.

Key words: word co-occurrence, social network, user sentiment, Biterm topic model, aggregation strategy
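
The aggregation and word-pair steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the whitespace tokenizer, the `window` parameter and the toy posts are all assumptions made for illustration, and the topic and sentiment inference that US-BTM performs on top of the biterms is not shown.

```python
from collections import Counter, defaultdict


def aggregate_by_user(posts):
    """Concatenate each user's short texts into one pseudo-document (a token list).

    `posts` is an iterable of (user_id, text) pairs; whitespace tokenization
    is an assumption, not something the abstract specifies.
    """
    pseudo_docs = defaultdict(list)
    for user, text in posts:
        pseudo_docs[user].extend(text.lower().split())
    return dict(pseudo_docs)


def extract_biterms(tokens, window=3):
    """Return unordered pairs of distinct words that fall inside a sliding
    span of `window` consecutive tokens (hypothetical parameter)."""
    biterms = []
    for i, w1 in enumerate(tokens):
        for w2 in tokens[i + 1 : i + window]:
            if w1 != w2:
                # Sort so that (a, b) and (b, a) count as the same biterm.
                biterms.append(tuple(sorted((w1, w2))))
    return biterms


def corpus_biterm_counts(posts, window=3):
    """Count biterms over all pseudo-documents, i.e. at corpus level."""
    counts = Counter()
    for tokens in aggregate_by_user(posts).values():
        counts.update(extract_biterms(tokens, window))
    return counts


# Toy data, invented for this example.
posts = [
    ("u1", "great coffee"),
    ("u1", "friendly staff"),
    ("u2", "coffee great"),
]
counts = corpus_biterm_counts(posts)
```

In this toy corpus the two posts by `u1` are merged before pairing, so words from different posts by the same user (for example `coffee` and `friendly`) can form a biterm. That is the effect aggregation is meant to have on sparse short texts.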

