网络与信息安全学报 (Chinese Journal of Network and Information Security), 2021, Vol. 7, Issue (3): 37-45. doi: 10.11959/j.issn.2096-109x.2021039

• Column I: Applications of Neural Network Technology •

Code vulnerability detection method based on graph neural network

Hao CHEN, Ping YI

  1. School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  • Revised: 2020-12-15 Online: 2021-06-15 Published: 2021-06-01
  • About the authors: Hao CHEN (1995- ), male, born in Weifang, Shandong; master's student at Shanghai Jiao Tong University; main research interests: deep learning and vulnerability detection
    Ping YI (1969- ), male, born in Luoyang, Henan; Ph.D.; associate professor at Shanghai Jiao Tong University; main research interest: artificial intelligence security
  • Supported by:
    The National Key R&D Program of China (2019YFB1405000); The National Key R&D Program of China (2017YFB0802900)

Abstract:

Most schemes that use neural networks for vulnerability detection follow traditional natural language processing ideas: they treat source code as sequence samples and ignore its structural features, and may therefore miss vulnerabilities. A code vulnerability detection method based on a graph neural network was proposed, which achieves function-level code vulnerability detection using control flow graph features of the intermediate language. First, the source code was compiled into an intermediate representation, from which the control flow graph carrying structural information was extracted. At the same time, a word vector embedding algorithm was used to initialize the basic block vectors, capturing the semantic information of the code. The two were then combined to generate graph-structured sample data, and a multilayer graph neural network was trained and tested on the features of this data. Test data generated from an open source vulnerability sample data set was used to evaluate the proposed method. The results show that the method effectively improves vulnerability detection capability.

Key words: vulnerability detection, graph neural network, control flow graph, intermediate representation
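
The pipeline summarized in the abstract (basic block vectors joined with control flow edges, then several rounds of graph propagation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the LLVM-style toy instructions, the hash-based `token_vector` that stands in for a trained word embedding, the mean-aggregation update, and all function names are hypothetical. A real system would learn the embeddings and layer weights and would feed the final vector to a classifier.

```python
import hashlib

DIM = 4  # embedding width, chosen only for illustration


def token_vector(token, dim=DIM):
    """Deterministic stand-in for a trained word-embedding lookup (assumption)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    # Map the first `dim` bytes to floats in [-1, 1].
    return [b / 127.5 - 1.0 for b in digest[:dim]]


def block_vector(instructions, dim=DIM):
    """Initialize a basic block as the average of its token vectors."""
    tokens = [tok for inst in instructions for tok in inst.split()]
    if not tokens:
        return [0.0] * dim
    vectors = [token_vector(tok, dim) for tok in tokens]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def build_graph_sample(cfg):
    """Join structure (edges) and semantics (node features) into one sample."""
    names = sorted(cfg)
    index = {name: i for i, name in enumerate(names)}
    features = [block_vector(cfg[name]["instructions"]) for name in names]
    edges = [(index[src], index[dst])
             for src in names for dst in cfg[src]["successors"]]
    return features, edges


def message_passing(features, edges, layers=2):
    """Each layer sets a node state to the mean of itself and its predecessors."""
    incoming = {i: [i] for i in range(len(features))}
    for src, dst in edges:
        incoming[dst].append(src)
    state = [list(vec) for vec in features]
    for _ in range(layers):
        state = [
            [sum(state[j][k] for j in incoming[i]) / len(incoming[i])
             for k in range(len(state[i]))]
            for i in range(len(state))
        ]
    return state


def graph_embedding(state):
    """Mean readout over all basic blocks gives one function-level vector."""
    return [sum(col) / len(state) for col in zip(*state)]


# Hypothetical three-block function: entry branches to body, both reach exit.
toy_cfg = {
    "entry": {"instructions": ["%1 = alloca i32",
                               "br i1 %cond, label %body, label %exit"],
              "successors": ["body", "exit"]},
    "body": {"instructions": ["store i32 0, i32* %1"],
             "successors": ["exit"]},
    "exit": {"instructions": ["ret void"], "successors": []},
}

features, edges = build_graph_sample(toy_cfg)
embedding = graph_embedding(message_passing(features, edges))
```

In this sketch the `entry` block has no predecessors, so its state is unchanged by propagation, while `exit` accumulates information from both paths that reach it. A sequence model that reads the instructions in file order would not see this branch structure, which is the gap the abstract describes.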

