Chinese Journal of Network and Information Security, 2024, Vol. 10, Issue (1): 123-135. doi: 10.11959/j.issn.2096-109x.2024002

• Academic paper •

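Multi-feature fusion malware detection method based on attention and gating mechanisms

Zhongyuan CHEN, Jianbiao ZHANG

  1. School of Computer Science, Department of Information Science, Beijing University of Technology, Beijing 100124, China
  • Revised: 2024-01-03 Online: 2024-02-01 Published: 2024-02-01
  • About the authors: Zhongyuan CHEN (born 1998), male, from Zhoukou, Henan, is a master's student at Beijing University of Technology. His main research interests are network and information security, deep learning, and trusted computing.
    Jianbiao ZHANG (born 1969), male, from Haimen, Jiangsu, is a professor and doctoral supervisor at Beijing University of Technology. His main research interests are trusted computing, system security, and cloud security.
  • Supported by:
    Beijing Natural Science Foundation (M21039)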

Multi-feature fusion malware detection method based on attention and gating mechanisms

Zhongyuan CHEN, Jianbiao ZHANG   

  1. School of Computer Science, Department of Information Science, Beijing University of Technology, Beijing 100124, China
  • Revised: 2024-01-03 Online: 2024-02-01 Published: 2024-02-01
  • Supported by:
    Beijing Natural Science Foundation (M21039)

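Abstract (translated from Chinese):

With the rapid development of network technology, the number of malware samples and their variants keeps growing, which makes malware detection a major challenge in network security. Existing single-feature detection methods represent sample information inadequately, while multi-feature methods are limited in how they fuse features and fail to learn the complex relationships within and between features; both problems degrade detection performance. This paper proposes MFAGM, a malware detection method based on multimodal feature fusion. By processing the .asm and .bytes files of the dataset, three key features of two types are extracted (opcode statistics sequences, API sequences, and grey-scale image features), so that each sample is characterized from multiple perspectives. To fuse these multimodal features more effectively, a feature fusion module, SA-JGmu, is designed. The module uses a self-attention mechanism to capture the internal dependencies among features, uses a gating mechanism to strengthen the interaction between different features, and introduces weighted skip connections to further improve the model's representational capacity. Experimental results on the MMCC (Microsoft malware classification challenge) dataset show that MFAGM achieves higher accuracy and F1 scores than the other methods compared on the malware detection task.

Keywords: malware detection, deep learning, feature fusion, multimodal learning, static analysis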
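The feature extraction step described in the abstract can be illustrated with a minimal Python sketch. It assumes that each `.bytes` line is an address followed by hexadecimal byte tokens (with `??` marking unreadable bytes) and that `.asm` lines are disassembly listings with `;` comments. The function names, the tiny opcode vocabulary, the image width, and the mapping of `??` to 0 are hypothetical choices made for illustration, not details taken from the paper. API sequence extraction is omitted.

```python
from collections import Counter

# Hypothetical opcode vocabulary; a real one would be far larger.
OPCODES = ("mov", "push", "call", "pop", "jmp")


def bytes_to_grey_rows(lines, width=4):
    """Turn hex-dump lines into rows of 0-255 pixel intensities."""
    pixels = []
    for line in lines:
        tokens = line.split()[1:]  # drop the leading address column
        for tok in tokens:
            # Assumption: unreadable bytes ("??") are mapped to 0.
            pixels.append(0 if tok == "??" else int(tok, 16))
    # Zero-pad the last row so every row has the same width.
    if len(pixels) % width:
        pixels += [0] * (width - len(pixels) % width)
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


def opcode_counts(asm_lines, vocab=OPCODES):
    """Count how often each vocabulary opcode appears, in vocab order."""
    counts = Counter()
    for line in asm_lines:
        code = line.split(";", 1)[0]  # ignore trailing comments
        for tok in code.split():
            if tok in vocab:
                counts[tok] += 1
                break  # count at most one mnemonic per line
    return [counts[op] for op in vocab]
```

For example, `bytes_to_grey_rows(["00401000 56 8D ?? 24", "00401004 FF 00"])` yields the two rows `[86, 141, 0, 36]` and `[255, 0, 0, 0]`.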

Abstract:

With the rapid development of network technology, the number and variety of malware have been increasing, posing a significant challenge in the field of network security. However, existing single-feature malware detection methods are inadequate at representing sample information. Multi-feature detection approaches, in turn, are limited in feature fusion and fail to learn the complex relationships within and between features. These limitations lead to subpar detection results. To address these issues, a malware detection method called MFAGM, based on multimodal feature fusion, is proposed. By processing the .asm and .bytes files of the dataset, three key features belonging to two types (opcode statistics sequences, API sequences, and grey-scale image features) are extracted, so that sample information is characterized from multiple perspectives. To better fuse these multimodal features, a feature fusion module called SA-JGmu is designed. This module uses a self-attention mechanism to capture the internal dependencies among features and a gating mechanism to enhance the interaction among different features. In addition, weighted skip connections are introduced to further improve the representational capability of the model. Experimental results on the Microsoft malware classification challenge (MMCC) dataset show that MFAGM achieves higher accuracy and F1 scores than other methods on the malware detection task.

Key words: malware detection, deep learning, feature fusion, multimodal learning, static analysis
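The fusion idea named in the abstract (self-attention, gating, and a weighted skip connection) can be sketched in dependency-free Python. This is a generic illustration and not the authors' SA-JGmu module: it assumes the three modality features have already been embedded as vectors of equal length, uses identity query/key/value projections, and the names `fuse`, `gate_params`, and `skip_weight` are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    attended = []
    for query in tokens:
        weights = softmax([dot(query, key) / scale for key in tokens])
        attended.append([sum(w * value[i] for w, value in zip(weights, tokens))
                         for i in range(dim)])
    return attended


def fuse(tokens, gate_params, skip_weight=0.5):
    """Attend across modalities, gate each one, then add a weighted skip."""
    dim = len(tokens[0])
    attended = self_attention(tokens)
    # One scalar gate per modality, computed from its attended vector.
    gates = [sigmoid(dot(p, t)) for p, t in zip(gate_params, attended)]
    gate_sum = sum(gates)
    fused = [sum(g * t[i] for g, t in zip(gates, attended)) / gate_sum
             for i in range(dim)]
    # Weighted skip connection: add a scaled mean of the raw inputs.
    mean_input = [sum(t[i] for t in tokens) / len(tokens) for i in range(dim)]
    return [f + skip_weight * m for f, m in zip(fused, mean_input)]
```

With three identical embeddings and all-zero gate parameters, every gate equals 0.5 and the output is the input scaled by `1 + skip_weight`. In a trained model the projections, gate parameters, and skip weight would be learned rather than fixed.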

