Journal on Communications (通信学报) ›› 2021, Vol. 42 ›› Issue (11): 13-27. doi: 10.11959/j.issn.1000-436x.2021198

• Special topic: Security technologies for computer communication and network systems •

Research on context-aware Android application vulnerability detection

Jiawei QIN1,2, Hua ZHANG1, Hanbing YAN2, Nengqiang HE2, Tengfei TU1

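  1 State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
  2 The National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China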
  • Revised: 2021-09-27 Online: 2021-11-25 Published: 2021-11-01
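  • About the authors: Jiawei QIN (1993- ), male, Manchu, born in Benxi, Liaoning, is a Ph.D. candidate at Beijing University of Posts and Telecommunications and an engineer at the National Computer Network Emergency Response Technical Team/Coordination Center of China. His main research interests include mobile security analysis and Internet of Things security analysis.
    Hua ZHANG (1978- ), female, born in Siping, Jilin, Ph.D., is an associate professor at Beijing University of Posts and Telecommunications. Her main research interests include network security and privacy protection.
    Hanbing YAN (1975- ), male, born in Jinxian, Jiangxi, Ph.D., is a professor-level senior engineer at the National Computer Network Emergency Response Technical Team/Coordination Center of China. His main research interests include network security and computer graphics.
    Nengqiang HE (1985- ), male, born in Yiwu, Zhejiang, Ph.D., is a senior engineer at the National Computer Network Emergency Response Technical Team/Coordination Center of China. His main research interests include mobile malware analysis and application security detection.
    Tengfei TU (1990- ), male, born in Linyi, Shandong, Ph.D., is a postdoctoral researcher at Beijing University of Posts and Telecommunications. His main research interests include network security and mobile security.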
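  • Supported by:
    The National Natural Science Foundation of China (62072051); The National Natural Science Foundation of China (61976024); The National Natural Science Foundation of China (61972048); The Fundamental Research Funds for the Central Universities (2019XD-A01); Key Project Plan of Blockchain in Ministry of Education (2020KJ010802)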

Research on context-aware Android application vulnerability detection

Jiawei QIN1,2, Hua ZHANG1, Hanbing YAN2, Nengqiang HE2, Tengfei TU1   

  1 State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
  2 The National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China
  • Revised: 2021-09-27 Online: 2021-11-25 Published: 2021-11-01
  • Supported by:
    The National Natural Science Foundation of China (62072051); The National Natural Science Foundation of China (61976024); The National Natural Science Foundation of China (61972048); The Fundamental Research Funds for the Central Universities (2019XD-A01); Key Project Plan of Blockchain in Ministry of Education (2020KJ010802)

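Abstract:

Learning-based vulnerability detection models for Android applications extract features from source programs that lack semantic information, and the extracted features contain noise unrelated to the vulnerability, which lowers the accuracy of the detection model. To address this problem, a program feature extraction method based on code slicing (CIS) was proposed. Compared with abstract syntax tree (AST) features, the method extracts the variable information directly related to a vulnerability more precisely, avoids introducing excessive noise, and reflects the semantic information of the vulnerability. Using CIS, a context-aware Android application vulnerability detection model, VulDGArcher, was proposed based on Bi-LSTM and an attention mechanism. Because Android vulnerability data sets are hard to obtain, a data set of 41 812 code fragments covering implicit Intent communication vulnerabilities and PendingIntent permission bypass vulnerabilities was constructed, of which 16 218 are vulnerable code fragments. On this data set, VulDGArcher reaches a detection accuracy of 96%, which is higher than that of deep learning vulnerability detection models based on AST features or on unprocessed APP source code features.

Keywords: Android vulnerability detection, deep learning, code slicing, vulnerability semantic features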

Abstract:

The features that learning-based vulnerability detection models for Android applications extract from source programs lack semantic information. The extracted features also contain noise data unrelated to vulnerabilities, which lowers the accuracy of the vulnerability detection model. A feature extraction method based on code information slice (CIS) was proposed. Compared with the abstract syntax tree (AST) feature method, the proposed method could extract the variable information directly related to vulnerabilities more accurately and avoid introducing too much noise data, while still carrying the semantic information of vulnerabilities. Based on CIS and Bi-LSTM with an attention mechanism, a context-aware Android application vulnerability detection model, VulDGArcher, was proposed. Because Android vulnerability data sets are not easy to obtain, a data set of 41 812 code fragments covering the implicit Intent security vulnerability and the PendingIntent permission audit bypass vulnerability was built, of which 16 218 were vulnerable code fragments. On this data set, the detection accuracy of VulDGArcher reached 96%, which is higher than that of deep learning vulnerability detection models based on AST features and on unprocessed APP source code features.
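
As an illustration of why slicing can cut noise, the following is a minimal sketch of a backward data-flow slice over straight-line code. It is not the CIS algorithm of the paper, whose details the abstract does not give. The `Stmt` type, the hand-written `defs`/`uses` sets, and the toy Android snippet in `PROGRAM` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    """One statement with the variables it defines and uses (hypothetical model)."""
    line: int
    text: str
    defs: frozenset = frozenset()
    uses: frozenset = frozenset()


def backward_slice(stmts, criterion_line):
    """Keep only statements that the criterion statement depends on through data flow.

    Assumes straight-line code (no branches or loops), so a single reverse scan suffices.
    """
    by_line = {s.line: s for s in stmts}
    criterion = by_line[criterion_line]
    needed = set(criterion.uses)          # variables whose definitions we still need
    kept = [criterion]
    earlier = [s for s in stmts if s.line < criterion_line]
    for s in sorted(earlier, key=lambda s: s.line, reverse=True):
        if s.defs & needed:
            kept.append(s)
            # The statement satisfies these needs and introduces its own.
            needed = (needed - s.defs) | s.uses
    return sorted(kept, key=lambda s: s.line)


# Toy snippet (assumed): an implicit Intent is built and broadcast at line 7.
# Calls that mutate `intent` are modelled as both using and defining it.
PROGRAM = [
    Stmt(1, 'String tag = "demo";', defs=frozenset({"tag"})),
    Stmt(2, "Intent intent = new Intent();", defs=frozenset({"intent"})),
    Stmt(3, 'intent.setAction("com.example.ACTION");',
         defs=frozenset({"intent"}), uses=frozenset({"intent"})),
    Stmt(4, 'Log.d(tag, "sending");', uses=frozenset({"tag"})),
    Stmt(5, "String token = loadToken();", defs=frozenset({"token"})),
    Stmt(6, 'intent.putExtra("token", token);',
         defs=frozenset({"intent"}), uses=frozenset({"intent", "token"})),
    Stmt(7, "sendBroadcast(intent);", uses=frozenset({"intent"})),
]

if __name__ == "__main__":
    for stmt in backward_slice(PROGRAM, criterion_line=7):
        print(stmt.line, stmt.text)
```

In this toy case the slice drops the logging statements at lines 1 and 4, which have no data-flow connection to the `sendBroadcast` call, and keeps the statements that build the Intent. A real slicer would also need control dependences and interprocedural analysis.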

Key words: Android vulnerability detection, deep learning, CIS, semantic characteristics of vulnerabilities
