Chinese Journal of Network and Information Security ›› 2023, Vol. 9 ›› Issue (6): 86-101. doi: 10.11959/j.issn.2096-109x.2023085

• Research Paper •

  • About the authors: LI Yan (1998- ), female, born in Weinan, Shaanxi, is a master's student at Huazhong University of Science and Technology; her main research interests are deep learning and vulnerability detection.
    QIANG Weizhong (1977- ), male, born in Nantong, Jiangsu, is a professor and doctoral supervisor at Huazhong University of Science and Technology; his main research interests are confidential computing, cloud computing security, and software security.
    LI Zhen (1981- ), female, born in Baoding, Hebei, Ph.D., is an associate professor at Huazhong University of Science and Technology; her main research interests are software security and AI security.
    ZOU Deqing (1975- ), male, born in Xiangtan, Hunan, Ph.D., is a professor and doctoral supervisor at Huazhong University of Science and Technology; his main research interests are cloud computing security, network attack and defense and vulnerability detection, software security, and privacy protection.
    JIN Hai (1966- ), male, born in Shanghai, Ph.D., is a professor and doctoral supervisor at Huazhong University of Science and Technology; his main research interests are computer architecture, virtualization technology, cluster and cloud computing, and storage and security.

Deep learning vulnerability detection method based on optimized inter-procedural semantics of programs

Yan LI1,2,3, Weizhong QIANG1,2,3, Zhen LI1,2,3, Deqing ZOU1,2,3, Hai JIN1,4   

  1. 1 Services Computing Technology and System Lab, National Engineering Research Center for Big Data Technology and System, Wuhan 430074, China
    2 Hubei Key Laboratory of Distributed System Security, Wuhan 430074, China
    3 School of Cyber Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
    4 School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
  • Revised: 2023-07-28 Online: 2023-12-01 Published: 2023-12-01
  • Supported by:
    The National Natural Science Foundation of China(62272187);The Joint Funds of the National Natural Science Foundation of China(U1936211)

摘要:

近年来,软件漏洞引发的安全事件层出不穷,及早发现并修补漏洞能够有效降低损失。传统的基于规则的漏洞检测方法依赖于专家定义规则,存在较高的漏报率,基于深度学习的方法能够自动学习漏洞程序的潜在特征,然而随着软件复杂程度的提升,该类方法在面对真实软件时存在精度下降的问题。一方面,现有方法执行漏洞检测时大多在函数级工作,无法处理跨函数的漏洞样例;另一方面,BGRU和BLSTM等模型在输入序列过长时性能下降,不善于捕捉程序语句间的长期依赖关系。针对上述问题,优化了现有的程序切片方法,结合过程内和过程间切片对跨函数的漏洞进行全面的上下文分析以捕获漏洞触发的完整因果关系;应用了包含多头注意力机制的 Transformer 神经网络模型执行漏洞检测任务,共同关注来自不同表示子空间的信息来提取节点的深层特征,相较于循环神经网络解决了信息衰减的问题,能够更有效地学习源程序的语法和语义信息。实验结果表明,该方法在真实软件数据集上的 F1 分数达到了 73.4%,相较于对比方法提升了13.6%~40.8%,并成功检测出多个开源软件漏洞,证明了其有效性与实用性。

关键词: 漏洞检测, 程序切片, 深度学习, 注意力机制

Abstract:

In recent years, software vulnerabilities have caused a steady stream of security incidents, and discovering and patching vulnerabilities early can effectively reduce losses. Traditional rule-based vulnerability detection methods rely on expert-defined rules and suffer from high false negative rates. Deep learning-based methods can automatically learn the latent features of vulnerable programs, but as software complexity grows, their precision drops on real-world software. On the one hand, most existing methods operate at the function level and therefore cannot handle vulnerability samples that span functions; on the other hand, models such as BGRU and BLSTM degrade on long input sequences and are poor at capturing long-range dependencies between program statements. To address these issues, the existing program slicing method was optimized: intra-procedural and inter-procedural slicing were combined to perform a comprehensive contextual analysis of cross-function vulnerabilities and to capture the complete causal chain of vulnerability triggering. A Transformer neural network with a multi-head attention mechanism was then applied to the vulnerability detection task; by jointly attending to information from different representation subspaces, it extracts deep features of nodes, avoids the information decay of recurrent neural networks, and learns the syntactic and semantic information of source programs more effectively. Experimental results show that the method achieves an F1 score of 73.4% on a real-world software dataset, an improvement of 13.6% to 40.8% over the compared methods, and that it successfully detects several vulnerabilities in open-source software, demonstrating its effectiveness and practicality.
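The combination of intra- and inter-procedural slicing described in the abstract can be illustrated on a toy dependence graph. This is a minimal sketch under invented assumptions: the statement labels, edge kinds, and the `backward_slice` helper are hypothetical, not the paper's actual slicer, which operates on dependence graphs produced by a real program analyzer.

```python
from collections import deque

# Toy backward-dependence graph: statement id -> statements it depends on.
# "intra" edges stay inside one function; "call" and "param" edges cross
# function boundaries and are what inter-procedural slicing adds.
deps = {
    "main:3": [("intra", "main:2")],
    "main:2": [("call", "copy:1")],    # value flows back from the callee
    "copy:2": [("intra", "copy:1")],
    "copy:1": [("param", "main:1")],   # argument flows in from the caller
    "main:1": [],
}

def backward_slice(criterion, deps, interprocedural=True):
    """Collect every statement the criterion transitively depends on.

    With interprocedural=False, traversal stops at call/param edges,
    mimicking a purely function-level (intra-procedural) analysis.
    """
    seen, work = {criterion}, deque([criterion])
    while work:
        node = work.popleft()
        for kind, dep in deps.get(node, []):
            if not interprocedural and kind != "intra":
                continue  # function-level analysis cannot follow this edge
            if dep not in seen:
                seen.add(dep)
                work.append(dep)
    return seen
```

With `interprocedural=True`, slicing from `"main:3"` reaches `"copy:1"` and `"main:1"`, recovering the full cross-function causal chain; with `interprocedural=False`, the slice stops at `"main:2"`, which is exactly the context a function-level detector loses.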

Key words: vulnerability detection, program slicing, deep learning, attention mechanism
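The multi-head attention mechanism referenced above, which lets a Transformer relate any two statements in a slice regardless of their distance, can be sketched in NumPy. This is an illustrative simplification, not the paper's model: the projection weights are random stand-ins for learned parameters, and token embeddings for one program slice are assumed as input.

```python
import numpy as np

def multi_head_attention(x, num_heads, rng):
    """Scaled dot-product attention over num_heads representation subspaces.

    x: (seq_len, d_model) embeddings of the statements in one slice.
    Every position attends to every other in a single step, so long-range
    dependencies do not decay with distance as in a recurrent pass.
    """
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0
    d_head = d_model // num_heads
    # Random projections stand in for the learned Q/K/V/output weights.
    w_q, w_k, w_v, w_o = (rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
                          for _ in range(4))

    def split(t):  # (seq, d_model) -> (heads, seq, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    scores -= scores.max(axis=-1, keepdims=True)         # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)             # softmax per row
    # Concatenate the heads and mix them with the output projection.
    out = (attn @ v).transpose(1, 0, 2).reshape(seq_len, d_model) @ w_o
    return out, attn
```

Each head's attention matrix is a row-stochastic `(seq_len, seq_len)` map, so every statement directly weighs every other statement; a BGRU or BLSTM would instead have to carry that information through all intermediate steps.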

