Chinese Journal of Network and Information Security, 2020, Vol. 6, Issue (6): 112-120. doi: 10.11959/j.issn.2096-109x.2020084

• Academic Paper •

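Lightweight malicious domain name detection model based on separable convolution

Luhui YANG1, Huiwen BAI1, Guangjie LIU1,2, Yuewei DAI1,2

  1. School of Automation, Nanjing University of Science and Technology, Nanjing 210094, China
  2. School of Electronic & Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
  • Revised: 2020-05-21 Online: 2020-12-15 Published: 2020-12-16
  • About the authors:
    Luhui YANG (1992- ), male, born in Lichuan, Jiangxi, is a Ph.D. candidate at Nanjing University of Science and Technology. His main research interest is network and information security.
    Huiwen BAI (1992- ), male, born in Baishan, Jilin, is a Ph.D. candidate at Nanjing University of Science and Technology. His main research interest is network traffic analysis.
    Guangjie LIU (1980- ), male, born in Xuzhou, Jiangsu, Ph.D., is a professor and doctoral supervisor at Nanjing University of Information Science and Technology. His main research interests are information security, multimedia systems, and deep learning.
    Yuewei DAI (1962- ), male, born in Zhenjiang, Jiangsu, Ph.D., is a professor and doctoral supervisor at Nanjing University of Information Science and Technology. His main research interest is network and multimedia information security.
  • Supported by:
    The National Natural Science Foundation of China (U1836104)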

Abstract:

Malicious domain name detection methods based on deep learning are computationally expensive, which makes them hard to apply effectively to domain name detection in real network environments. To address this, a lightweight malicious domain name detection model based on separable convolution was proposed. The model uses a separable convolution structure: it first applies a depthwise convolution to every input channel and then performs a pointwise convolution across all resulting channels. This reduces the number of convolution parameters without weakening convolutional feature extraction, giving a faster convolution while preserving model accuracy. To lessen the effect on classification accuracy of the imbalance between the numbers of positive and negative samples, and between easy and hard samples, a focal loss function was introduced in the training process. The proposed algorithm was compared with three typical detection models based on deep neural networks on a public data set. Experimental results show that the proposed algorithm achieves detection accuracy close to that of the state-of-the-art model while significantly improving model inference speed on CPU.

Key words: separable convolution, domain generation algorithm, deep learning, cyber security
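
The two ingredients named in the abstract, separable convolution and focal loss, can be illustrated with a minimal sketch. It is not the authors' implementation: the channel counts, kernel width, and the `gamma` and `alpha` values below are illustrative assumptions (the abstract gives none), bias terms are ignored, and the binary focal loss is written in its commonly used form, -alpha_t * (1 - p_t)^gamma * log(p_t).

```python
import math


def standard_conv1d_params(c_in, c_out, k):
    """Weights in a standard 1D convolution (bias ignored)."""
    return c_in * c_out * k


def separable_conv1d_params(c_in, c_out, k):
    """Weights in a depthwise separable 1D convolution (bias ignored)."""
    depthwise = c_in * k       # one width-k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes the channels
    return depthwise + pointwise


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one sample.

    p: predicted probability of the positive (malicious) class
    y: true label, 1 or 0
    gamma, alpha: assumed hyperparameters, not taken from the paper
    """
    p = min(max(p, eps), 1.0 - eps)          # keep log() finite
    p_t = p if y == 1 else 1.0 - p           # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) samples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical layer sizes chosen only for illustration.
c_in, c_out, k = 128, 128, 3
std = standard_conv1d_params(c_in, c_out, k)    # 49152
sep = separable_conv1d_params(c_in, c_out, k)   # 16768
print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")

# An easy positive (p = 0.9) contributes far less loss than a hard one (p = 0.1).
print(focal_loss(0.9, 1), focal_loss(0.1, 1))
```

Under these assumptions the separable layer keeps a fraction 1/c_out + 1/k of the standard layer's weights, which is where the reduction in parameters and computation comes from. Setting `gamma=0` and `alpha=0.5` reduces the focal loss to half of the ordinary cross-entropy, so the focusing effect comes entirely from the `(1 - p_t) ** gamma` factor.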

CLC number: