Chinese Journal of Network and Information Security (网络与信息安全学报) ›› 2017, Vol. 3 ›› Issue (8): 44-60. doi: 10.11959/j.issn.2096-109x.2017.00186

• Academic paper

Study of high-speed malicious Web page detection system based on two-step classifier

Zheng-qi WANG1,2, Xiao-bing FENG1,2, Chi ZHANG1,2

  1. University of Science and Technology of China, Hefei 230026, China
  2. Key Laboratory of Electromagnetic Space Information, Chinese Academy of Sciences, Hefei 230026, China
  • Revised: 2017-07-22 Online: 2017-08-01 Published: 2017-12-26
  • About the authors: Zheng-qi WANG (王正琦, born 1992), male, from Zhenjiang, Jiangsu, is a master's student at the University of Science and Technology of China; his main research interest is network security. | Xiao-bing FENG (冯晓兵, born 1992), female, from Liaocheng, Shandong, is a master's student at the University of Science and Technology of China; her main research interest is network security. | Chi ZHANG (张驰, born 1977), male, is an associate professor and doctoral supervisor at the University of Science and Technology of China; his main research interests are computer networks and information security.
  • Supported by:
    The National Natural Science Foundation of China (61202140); The National Natural Science Foundation of China (61328208)

Abstract:

Traditional static malicious Web page detection schemes are under growing pressure from the massive number of newly created Web pages. To address this, a two-step analysis and detection process was introduced, with a dedicated feature extraction scheme proposed for each step. By applying an optimized naive Bayes algorithm and a support vector machine algorithm hierarchically, a malicious Web page detection system that balances efficiency and functionality, TSMWD (two-step malicious Web page detection system), was designed and implemented. The first step filters out the large volume of normal Web pages; it is efficient, fast, and easy to update and iterate, and it prioritizes the true positive rate. The second step handles only the limited number of samples that pass the first filter, so it pursues detection performance: it demands high detection accuracy and relaxes the constraints on time and resource overhead accordingly. Experimental results show that the architecture increases detection speed while keeping overall detection accuracy essentially unchanged, so the system can serve more detection requests in a given amount of time.
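
The two-step flow described in the abstract can be illustrated with a minimal sketch. It is not the TSMWD implementation: the class name `TwoStepDetector`, the threshold value, the token list, and both toy functions are assumptions made for illustration, standing in for the first-step naive Bayes model and the second-step support vector machine. The sketch only shows the control flow, in which a cheap first step filters out pages it considers clearly normal and only the remaining pages reach the costlier second step.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical type aliases: a "page" is just its raw text here.
Scorer = Callable[[str], float]      # cheap first-step suspicion score in [0, 1]
Classifier = Callable[[str], bool]   # costlier second-step decision


@dataclass
class TwoStepDetector:
    fast_score: Scorer          # stand-in for the first-step naive Bayes model
    slow_classify: Classifier   # stand-in for the second-step SVM model
    threshold: float = 0.2      # assumed low cutoff, so few malicious pages are filtered out

    def detect(self, page: str) -> Tuple[bool, int]:
        """Return (is_malicious, step_that_decided)."""
        if self.fast_score(page) < self.threshold:
            return False, 1     # judged normal by step 1; step 2 is skipped
        return self.slow_classify(page), 2


# Toy stand-ins (assumptions for illustration, not the paper's features).
SUSPICIOUS = ("eval(", "unescape(", "<iframe", "document.write(")


def toy_fast_score(page: str) -> float:
    # Fraction of suspicious tokens that occur in the page.
    hits = sum(tok in page for tok in SUSPICIOUS)
    return hits / len(SUSPICIOUS)


def toy_slow_classify(page: str) -> bool:
    # Placeholder rule; a trained classifier would go here.
    return "eval(" in page and "unescape(" in page


detector = TwoStepDetector(toy_fast_score, toy_slow_classify)
pages = [
    "<html><p>hello</p></html>",
    "<html><iframe src='a'></iframe></html>",
    "<script>eval(unescape('%61'))</script>",
]
results = [detector.detect(p) for p in pages]
```

In this sketch the first page never reaches the second step, which is where the speed-up of such a cascade would come from. Keeping the first-step threshold low mirrors the abstract's emphasis on the true positive rate: a malicious page wrongly filtered out at step 1 is never re-examined.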

Key words: malicious Web page detection, network security, machine learning, feature extraction

CLC number: