Journal on Communications ›› 2019, Vol. 40 ›› Issue (6): 1-13. doi: 10.11959/j.issn.1000-436x.2019149

• Topic: Network Attack and Defense and Security Measurement •

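An automated crowdturfing attack method for Chinese user reviews

WANG Lina (王丽娜) 1,2, GUO Xiaodong (郭晓东) 1,2, WANG Run (汪润) 1,2

  1. Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, Wuhan University, Wuhan 430072, Hubei, China
  2. School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, Hubei, China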
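  • Revised: 2019-05-31 Online: 2019-06-25 Published: 2019-07-04
  • About the authors: WANG Lina (王丽娜, born 1964), female, from Yingkou, Liaoning; Ph.D., professor and doctoral supervisor at Wuhan University. Her main research interests include software and system security, information hiding, and artificial intelligence security. | GUO Xiaodong (郭晓东, born 1994), male, from Datong, Shanxi; master's student at Wuhan University. His main research interests include natural language processing and artificial intelligence security. | WANG Run (汪润, born 1991), male, from Anqing, Anhui; Ph.D., Wuhan University. His main research interests include software and system security and artificial intelligence security.
  • Supported by:
    National Natural Science Foundation of China (61876134); National Natural Science Foundation of China (U183610015); Fundamental Research Funds for the Central Universities (2042018kf1028)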

Automated crowdturfing attack in Chinese user reviews

WANG Li’na 1,2, GUO Xiaodong 1,2, WANG Run 1,2

  1. Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, Wuhan University, Wuhan 430072, China
  2. School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China
  • Revised: 2019-05-31 Online: 2019-06-25 Published: 2019-07-04
  • Supported by:
    The National Natural Science Foundation of China (61876134); The National Natural Science Foundation of China (U183610015); The Fundamental Research Funds for the Central Universities (2042018kf1028)

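Abstract (Chinese original, translated):

Text-oriented automated crowdturfing attacks are cheap to mount and hard to detect. Such attacks can automatically generate large numbers of fake reviews and undermine the healthy development of user review communities. In recent years, researchers have studied automated text crowdturfing attacks against English-language review communities, but few studies have addressed Chinese-language review communities. To fill this gap, an attack method that automatically generates Chinese text with a Chinese character embedding LSTM model is proposed. The attack model is built by training a multilayer network composed of a Chinese character embedding network, an LSTM network, and a softmax fully connected network, and by introducing a temperature parameter T. In the experiments, more than 50 000 real user reviews were crawled from Taobao's online user reviews to verify the effectiveness of the proposed attack. The results show that the generated fake reviews can effectively deceive classification-based detection methods that rely on linguistic analysis, as well as basic text copy detection methods. Extensive human evaluation further shows that the generated text is highly realistic and covers a wide range of types.

Keywords: user review community, automated crowdturfing attack, Chinese character embedding network, long short-term memory network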

Abstract:

Text-oriented automated crowdturfing attacks have low attack cost and strong concealment. Such attacks can automatically generate a large number of fake reviews, harming the healthy development of user review communities. In recent years, researchers have studied text-oriented automated crowdturfing attacks on English review communities, but there has been little research on automated crowdturfing attacks in Chinese review communities. A Chinese character embedding LSTM model was proposed to automatically generate Chinese reviews for automated crowdturfing attacks. The model was trained as a combination of a Chinese character embedding network, an LSTM network and a softmax dense network, and a temperature parameter T was introduced to construct the attack model. In the experiment, more than 50 000 real user reviews were crawled from Taobao's online review platform to verify the effectiveness of the attack method. Experimental results show that the generated fake reviews can effectively fool linguistics-based classification detection approaches and text plagiarism detection approaches. In addition, extensive manual evaluation experiments demonstrate that the reviews generated by the proposed attack approach perform well in realism and diversity.

Key words: user review community, automated crowdturfing attack, Chinese character embedding network, LSTM
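
The abstract mentions a temperature parameter T but does not say how it enters the model. In character-level text generation, a common choice is to divide the output scores by T before the softmax, and the sketch below assumes that convention. The vocabulary, the scores, and the function names `softmax_with_temperature` and `sample_next_char` are hypothetical stand-ins for illustration, not the authors' implementation.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, rescaled by a temperature T."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_char(vocab, logits, temperature=1.0, rng=random):
    """Draw one character from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


# Hypothetical vocabulary and scores standing in for the output layer
# of a trained character-level model (assumed, not from the paper).
vocab = ["好", "很", "不", "错"]
logits = [2.0, 1.0, 0.5, 0.1]

cold = softmax_with_temperature(logits, temperature=0.5)  # sharper
hot = softmax_with_temperature(logits, temperature=2.0)   # flatter
print(max(cold), max(hot))
print(sample_next_char(vocab, logits, temperature=0.5, rng=random.Random(0)))
```

Under this assumed convention, a small T concentrates probability on the highest-scoring character, which tends to give more repetitive but safer text, while a large T flattens the distribution and gives more varied output.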

CLC number:

  • TP309.1