Big Data


Research on Public Emotion Monitoring Based on Social Network Big Data

LI Aili1, ZHANG Zishuai1, LIN Yin2, WANG Qiuju2, YANG Jianan1, MENG Weicheng1, ZHANG Yanfeng1

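  1 Department of Computer Science and Engineering, Northeastern University, Shenyang 110000, China

  2 Department of Foreign Languages, Northeastern University, Shenyang 110000, China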

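  • About the authors: LI Aili, born in 1995, is a postgraduate student at Northeastern University whose main research interests are sentiment analysis and data mining. ZHANG Zishuai, born in 2000, is an undergraduate at Northeastern University whose main research interests are data mining and machine learning. LIN Yin, born in 1984, holds a master's degree and is a lecturer whose main research interest is comparative studies of Chinese and Japanese culture. WANG Qiuju, born in 1962, holds a Ph.D. and is a professor whose main research interests are comparative studies of Chinese and Japanese culture and science, technology and culture (STC) studies. YANG Jianan, born in 2002, is an undergraduate at Northeastern University whose main research interests are data mining and machine learning. MENG Weicheng, born in 2002, is an undergraduate at Northeastern University whose main research interests are data mining and machine learning. ZHANG Yanfeng, born in 1982, holds a Ph.D. and is a professor and a CCF senior member whose main research interests are big data mining, large-scale machine learning, and distributed systems.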

Research on Emotion Monitoring of Chinese-Japanese Public Based on Social Network Big Data

LI Aili1, ZHANG Zishuai1, LIN Yin2, WANG Qiuju2, YANG Jianan1, MENG Weicheng1, ZHANG Yanfeng1

  1 Department of Computer Science and Engineering, Northeastern University, Shenyang 110000, China

  2 Department of Foreign Languages, Northeastern University, Shenyang 110000, China

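Abstract:

In recent years, social networking platforms such as Sina Weibo and Twitter have become one of the main carriers of public opinion, giving internet users a convenient place to voice their views and emotions. When an emergency occurs, people post microblogs, tweets, and similar messages to express their attitudes toward it; this information then spreads further through the social network and produces a certain social impact. Public opinion monitoring based on social network big data has become a new research hotspot. Monitoring public sentiment with social network big data from different countries helps to grasp directly the emotional tendencies of the public in international relations, and is important for China's diplomacy, politics, and foreign trade. On this basis, a public sentiment monitoring system for Chinese and Japanese corpora is proposed. The system analyzes the sentiment expressed in Chinese and Japanese data from social platforms such as Weibo and Twitter and presents the results to users in visual form. For the sentiment analysis algorithm, a new sentiment classification model, EmoBERT, is proposed, which builds on the BERT model and incorporates self-expanded Chinese and Japanese sentiment lexicons. Experimental results show that EmoBERT outperforms the original BERT model on both the Chinese and the Japanese sentiment classification tasks: EmoBERT-C raises the accuracy of Chinese BERT from 89.68% to 92.15%, and EmoBERT-J raises the accuracy of the Japanese BERT model from 74.73% to 78.26%.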

Keywords:

Sentiment analysis, Public opinion monitoring, Sentiment lexicon, Chinese-Japanese relations, Weibo, Twitter

Abstract:

In recent years, social networking platforms such as Sina Weibo and Twitter have become one of the main carriers of public opinion, providing a convenient place for netizens to express their opinions and emotions. When an emergency occurs, people publish microblogs, tweets, and similar posts on these platforms to express their attitudes toward it. This information spreads further through social networks and thus produces a certain social impact. Public opinion monitoring based on social network big data has become a new research hotspot. Using social network big data from different countries to monitor public sentiment helps to grasp directly the emotional tendencies of the public in international relations, and plays an important role in China's diplomacy, politics, foreign trade, and other areas. Based on this, this paper proposes a public sentiment monitoring system for Chinese and Japanese corpora, which analyzes the emotional tendencies contained in Chinese and Japanese data from social platforms such as Weibo and Twitter and displays them to users in visual form. For the sentiment analysis algorithm, we propose a new sentiment classification model, EmoBERT, which builds on the BERT model and incorporates self-expanded Chinese and Japanese sentiment lexicons. The experimental results show that EmoBERT outperforms the original BERT model on both the Chinese and the Japanese sentiment classification tasks: EmoBERT-C increases the accuracy of Chinese BERT from 89.68% to 92.15%, and EmoBERT-J increases the accuracy of the Japanese BERT model from 74.73% to 78.26%.

Key words: Sentiment analysis, Public opinion monitoring, Sentiment lexicon, Chinese-Japanese relations, Weibo, Twitter
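
To put the accuracy figures reported in the abstract in perspective, the short sketch below computes the absolute gain in percentage points and the relative reduction in classification errors for each language. Only the four accuracy values come from the abstract; the function name `accuracy_gain` and the relative error reduction metric are illustrative additions and are not part of the paper.

```python
def accuracy_gain(baseline_acc, improved_acc):
    """Compare two accuracies given in percent.

    Returns a tuple of:
      - the absolute gain in percentage points, and
      - the relative reduction of the error rate, in percent.
    This helper is hypothetical and only illustrates the arithmetic.
    """
    absolute_gain = improved_acc - baseline_acc
    baseline_error = 100.0 - baseline_acc
    relative_error_reduction = 100.0 * absolute_gain / baseline_error
    return absolute_gain, relative_error_reduction


# Accuracy values (percent) as reported in the abstract: (BERT, EmoBERT).
results = {
    "EmoBERT-C (Chinese)": (89.68, 92.15),
    "EmoBERT-J (Japanese)": (74.73, 78.26),
}

for name, (baseline, improved) in results.items():
    gain, error_reduction = accuracy_gain(baseline, improved)
    print(f"{name}: +{gain:.2f} points, {error_reduction:.1f}% fewer errors")
```

Under these definitions the gains are 2.47 points for Chinese and 3.53 points for Japanese. The Japanese model gains more in absolute terms, while the Chinese model removes a larger share of the baseline's remaining errors (about 23.9% versus about 14.0%).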
