Journal on Communications ›› 2017, Vol. 38 ›› Issue (1): 44-53. doi: 10.11959/j.issn.1000-436x.2017006

• Academic paper

Weibo spammers’ identification algorithm based on Bayesian model

Yan-mei ZHANG1, Ying-ying HUANG1, Shi-jie GAN1, Yi DING2, Zhi-long MA3

  1 Information School, Central University of Finance and Economics, Beijing 100081, China
  2 Network and Data Security Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 610054, China
  3 Computer Science and Engineering School, Xinjiang University of Finance and Economics, Urumqi 830012, China
  • Revised: 2016-09-26 Online: 2017-01-01 Published: 2017-01-23
  • About the authors:
    Yan-mei ZHANG (born 1976), female, from Jilin City, Jilin Province, PhD, associate professor at Central University of Finance and Economics. Research interests: intelligent data analysis and service computing.
    Ying-ying HUANG (born 1995), female, from Haikou, Hainan. Research interests: intelligent data analysis.
    Shi-jie GAN (born 1997), male, from Linshui, Sichuan. Research interests: information security and intelligent data analysis.
    Yi DING (born 1985), male, from Yibin, Sichuan, PhD, associate professor at University of Electronic Science and Technology of China. Research interests: medical image processing and pattern recognition.
    Zhi-long MA (born 1983), male, from Yumin, Xinjiang, lecturer at Xinjiang University of Finance and Economics. Research interests: information security.
  • Supported by:
    The National Natural Science Foundation of China (61602536); The National Natural Science Foundation of China (61273293); The National Natural Science Foundation of China (61309029); Beijing Municipal Social Science Key Foundation (16YJA001); The Open Project of Network and Data Security Key Laboratory of Sichuan Province (NDSMS201605); The Discipline Construction Foundation of the Central University of Finance and Economics

Abstract:

To identify spammers effectively, a classifier for Weibo spammers was designed on the basis of previous research, using six feature attributes: the follower-to-following ratio, the average number of posts, the number of mutual follows, the comprehensive quality evaluation, the number of favorites, and the Sunshine Credit score. The spammer identification algorithm was implemented with a Bayesian model and a genetic optimization algorithm. Its performance was verified on real Sina Weibo data. The experimental results show that the proposed Bayesian identification algorithm maintains spammer identification accuracy without sacrificing the identification rate for non-spammers, and that the proposed threshold (threshold value matrix) optimization algorithm significantly improves spammer identification accuracy.

Key words: network spammer, spammer identification, Weibo, Bayesian model, genetic algorithm
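
The abstract names the ingredients (six account features, per-feature thresholds, a Bayesian model) but not the exact formulation. The sketch below is therefore only one plausible reading: a Bernoulli naive Bayes classifier over thresholded features. The feature order, the threshold values, the add-one smoothing, and the toy accounts are all assumptions made for illustration, not values from the paper, and the genetic search over thresholds is not shown.

```python
import math

# Hypothetical feature order; the names follow the abstract, the layout is assumed.
FEATURES = ["follower_following_ratio", "avg_posts", "mutual_follows",
            "quality_score", "favorites", "sunshine_credit"]

# Illustrative thresholds only; the paper tunes its thresholds with a genetic algorithm.
THRESHOLDS = [0.5, 5.0, 10, 0.5, 5, 50]

def binarize(sample, thresholds):
    """Map each raw feature value to 1 if it meets its threshold, else 0."""
    return [1 if x >= t else 0 for x, t in zip(sample, thresholds)]

def train(samples, labels, thresholds):
    """Estimate class priors and per-feature Bernoulli likelihoods."""
    model = {}
    for c in set(labels):
        rows = [binarize(s, thresholds) for s, y in zip(samples, labels) if y == c]
        prior = len(rows) / len(samples)
        # P(feature_i = 1 | class c) with add-one (Laplace) smoothing
        p_one = [(sum(r[i] for r in rows) + 1) / (len(rows) + 2)
                 for i in range(len(FEATURES))]
        model[c] = (prior, p_one)
    return model

def predict(model, sample, thresholds):
    """Return the class with the highest log-posterior under naive Bayes."""
    bits = binarize(sample, thresholds)
    best, best_score = None, -math.inf
    for c, (prior, p_one) in model.items():
        score = math.log(prior)
        for b, p in zip(bits, p_one):
            score += math.log(p if b else 1 - p)
        if score > best_score:
            best, best_score = c, score
    return best

# Made-up toy accounts, purely to exercise the code.
samples = [
    [0.1, 20, 1, 0.2, 0, 30], [0.05, 30, 0, 0.1, 1, 40], [0.2, 25, 2, 0.3, 0, 35],
    [1.5, 2, 50, 0.8, 20, 80], [0.9, 3, 30, 0.7, 10, 70], [2.0, 1, 40, 0.9, 15, 90],
]
labels = ["spammer"] * 3 + ["normal"] * 3
model = train(samples, labels, THRESHOLDS)

print(predict(model, [0.1, 22, 1, 0.2, 0, 30], THRESHOLDS))
print(predict(model, [1.2, 2, 45, 0.8, 12, 85], THRESHOLDS))
```

Under this reading, a threshold optimizer would treat `THRESHOLDS` as the individual to evolve and use classification accuracy on labeled accounts as the fitness function.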
