Journal on Communications (通信学报) ›› 2016, Vol. 37 ›› Issue (9): 75-91. doi: 10.11959/j.issn.1000-436x.2016180

• Academic Paper •

Spam album detection scheme for online social networks (在线社交网络中Spam相册检测方案)

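Shao-qing LYU (吕少卿)¹, Yu-qing ZHANG (张玉清)¹,², Dong-hang LIU (刘东航)¹, Guang-hua ZHANG (张光华)¹,³

    1 State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an, Shaanxi 710071
    2 National Computer Network Intrusion Protection Center, University of Chinese Academy of Sciences, Beijing 100190
    3 Beijing Key Laboratory of IOT Information Security Technology, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100097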
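  • Publication date: 2016-09-25  Release date: 2016-09-28
  • Funding:
    National Natural Science Foundation of China; National Natural Science Foundation of China; National Natural Science Foundation of China; Open Project Fund of the Beijing Key Laboratory of IOT Information Security Technology; China Postdoctoral Science Foundation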

Detecting Spam albums in online social network

Shao-qing LYU¹, Yu-qing ZHANG¹,², Dong-hang LIU¹, Guang-hua ZHANG¹,³

    1 Information Security Research Center of State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an 710071, China
    2 National Computer Network Intrusion Protection Center, University of Chinese Academy of Sciences, Beijing 100190, China
    3 Beijing Key Laboratory of IOT Information Security Technology, Institute of Information Engineering, CAS, Beijing 100097, China
  • Online: 2016-09-25  Published: 2016-09-28
  • Supported by:
    The National Natural Science Foundation of China; The National Natural Science Foundation of China; The National Natural Science Foundation of China; Open Fund of Beijing Key Laboratory of IOT Information Security Technology; China Postdoctoral Science Foundation

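Abstract (translated from the Chinese original):

A detection scheme for Spam albums is proposed. The attack characteristics of Photo Spam and its differences from traditional Spam are first analyzed, and on this basis 12 features that can be extracted promptly and computed efficiently are constructed. Using these features, a supervised-learning detection model is proposed, and a Spam album classifier is trained on 2 356 albums. Experiments show that the classifier correctly detects 100% of the Spam albums and 98.2% of the normal albums in the test set. Finally, the trained model is applied to a real-world dataset of 315 115 albums, where it detects 89 163 Spam albums with an accuracy of 97.2%.

Key words (translated from the Chinese original): social network security, PhotoSpam, Spam detection, RenRen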

Abstract:

A supervised learning solution was proposed to detect Spam albums, rather than spammers, in Photo Spam. First, the characteristics of Photo Spam and its differences from traditional Spam were analyzed. Based on this analysis, 12 features that can be extracted promptly and computed efficiently were constructed. Next, a classification model was trained on a dataset of 2 356 labeled albums to identify Spam albums. The model performed well, with true positive rates of 100% for Spam albums and 98.2% for normal albums. Finally, the detection model was applied to 315 115 unlabeled albums and detected 89 163 Spam albums with a true positive rate of 97.2%.

Key words: social network security, Photo Spam, Spam detection, RenRen
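
The rates quoted in the abstract are simple ratios, and a minimal sketch may help show how they relate to album counts. The dataset size and the number of flagged albums below come from the abstract. The test-set counts are hypothetical, chosen only so that they reproduce the reported 100% and 98.2% rates; the abstract does not give the actual test-set sizes. The helper name `rate` is likewise an illustrative assumption, not part of the authors' method.

```python
def rate(part: int, total: int) -> float:
    """Return part / total as a fraction, rejecting an empty denominator."""
    if total <= 0:
        raise ValueError("total must be positive")
    return part / total


# Figures quoted in the abstract.
TOTAL_ALBUMS = 315_115    # unlabeled albums in the real-world dataset
FLAGGED_ALBUMS = 89_163   # albums the trained model labeled as Spam

# Share of the real-world dataset that the model flagged as Spam.
flagged_share = rate(FLAGGED_ALBUMS, TOTAL_ALBUMS)
print(f"Flagged as Spam: {flagged_share:.1%}")

# HYPOTHETICAL test-set counts (not from the paper), picked only to
# reproduce the reported per-class rates of 100% and 98.2%.
spam_rate = rate(500, 500)      # Spam albums correctly labeled Spam
normal_rate = rate(982, 1000)   # normal albums correctly labeled normal
print(f"Spam albums detected: {spam_rate:.1%}")
print(f"Normal albums kept:   {normal_rate:.1%}")
```

Under these figures, roughly 28.3% of the albums in the real-world dataset were flagged as Spam.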