Chinese Journal of Network and Information Security (网络与信息安全学报), 2020, Vol. 6, Issue (6): 164-173. doi: 10.11959/j.issn.2096-109x.2020086

• Academic paper •

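User interests-based microblog tracing algorithm (基于用户兴趣的微博溯源算法)

Xiao YANG 1,2; Xiuzhen CHEN 1,2; Jin MA 1,2; Haozhe LIANG 1,2; Shenghong LI 1,2

  1. Institute of Cyber Science and Technology, Shanghai Jiao Tong University, Shanghai 200240, China
  2. Shanghai Key Laboratory of Integrated Administration Technologies for Information Security, Shanghai 200240, China
  • Revised: 2020-09-23  Online: 2020-12-15  Published: 2020-12-16
  • About the authors:
    Xiao YANG (1993- ), male, born in Chongqing, master's student at Shanghai Jiao Tong University. Main research interests: social network data analysis and natural language processing.
    Xiuzhen CHEN (1977- ), female, born in Liaocheng, Shandong, Ph.D., associate professor at Shanghai Jiao Tong University. Main research interests: social network data analysis, Internet of Vehicles information security, and security detection and evaluation.
    Jin MA (1977- ), female, born in Tengzhou, Shandong, Ph.D., senior engineer at Shanghai Jiao Tong University. Main research interests: big data and artificial intelligence applications, Internet of Vehicles information security, and new technologies for integrated cyberspace security management.
    Haozhe LIANG (1994- ), male, born in Liuzhou, Guangxi, master's student at Shanghai Jiao Tong University. Main research interests: browser security and distributed computing.
    Shenghong LI (1971- ), male, born in Suizhong, Liaoning, Ph.D., professor and doctoral supervisor at Shanghai Jiao Tong University. Main research interests: information security and artificial intelligence.
  • Supported by:
    The National Key R&D Program of China (2016YFB0801003); The National Natural Science Foundation of China (61562004); The National Natural Science Foundation of China (61431008)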

Abstract:

Microblog information tracing analyzes topic datasets collected from the platform to find the true source of a topic, that is, the set of microblogs that were published early and have high influence. It supports the management and guidance of online public opinion and is significant for information security. A user interests-based tracing method (ITM) is proposed. The method calculates a blogger's influence from the blogger's interests, and calculates the influence of commenters and reposters from their interests. Combining these with factors such as the blogger's notability and the publication time, it scores each microblog with a web page ranking algorithm and traces the source by ranking the microblogs by score. Experimental results show that the proposed algorithm improves recall by about 21% compared with traditional tracing algorithms.

Key words: information tracing, microblog, interest, influence, notability
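
The abstract names the ingredients of the method (interest-based influence of bloggers, commenters and reposters, blogger notability, publication time, and a web page ranking algorithm) but not its formulas. The following is therefore only an illustrative sketch of how such a pipeline could be wired together, not the paper's ITM. Everything concrete in it is an assumption: the sparse interest vectors, cosine similarity as the interest measure, follower count as a stand-in for notability, an exponential half-life for earliness, a PageRank-style power iteration as the ranking step, and the hypothetical names `cosine` and `trace_sources`.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse interest vectors (topic -> weight)."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def trace_sources(blogs, reposts, topic, damping=0.85, half_life=24.0, iters=50):
    """Rank microblogs as candidate sources (hypothetical sketch).

    blogs:   {blog_id: {"interest": dict, "followers": int, "hours": float}}
             "hours" is the time since the earliest post in the topic.
    reposts: list of (reposting_id, original_id, reposter_interest)
    topic:   interest vector of the topic being traced
    """
    ids = list(blogs)
    max_followers = max(b["followers"] for b in blogs.values()) or 1

    # Prior score: author influence x notability x earliness (assumed forms).
    prior = {}
    for i in ids:
        b = blogs[i]
        influence = cosine(b["interest"], topic)
        notability = b["followers"] / max_followers
        earliness = 0.5 ** (b["hours"] / half_life)
        prior[i] = influence * notability * earliness + 1e-9
    total = sum(prior.values())
    prior = {i: p / total for i, p in prior.items()}

    # Edge weight: how well the reposter's interests match the topic.
    out = {i: {} for i in ids}
    for src, dst, interest in reposts:
        out[src][dst] = out[src].get(dst, 0.0) + cosine(interest, topic)

    # PageRank-style power iteration with the prior as teleport vector.
    rank = dict(prior)
    for _ in range(iters):
        nxt = {i: (1.0 - damping) * prior[i] for i in ids}
        for src in ids:
            weight_sum = sum(out[src].values())
            if weight_sum > 0:
                for dst, w in out[src].items():
                    nxt[dst] += damping * rank[src] * w / weight_sum
            else:
                # Dangling node: redistribute its rank according to the prior.
                for i in ids:
                    nxt[i] += damping * rank[src] * prior[i]
        rank = nxt
    return sorted(ids, key=rank.get, reverse=True)


# Toy data: "a" is the early, well-followed original; "b" and "c" repost it.
blogs = {
    "a": {"interest": {"sports": 1.0}, "followers": 1000, "hours": 0.0},
    "b": {"interest": {"sports": 0.5, "music": 0.5}, "followers": 200, "hours": 5.0},
    "c": {"interest": {"music": 1.0}, "followers": 50, "hours": 8.0},
}
reposts = [
    ("b", "a", {"sports": 1.0}),
    ("c", "a", {"sports": 0.8, "music": 0.2}),
]
ranking = trace_sources(blogs, reposts, topic={"sports": 1.0})
print(ranking)
```

In this toy setup the early original post collects rank from both reposts and so comes first. Recall, the metric the abstract reports, would then be measured by comparing the top-ranked microblogs against a labeled set of true sources.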

CLC number: