Journal on Communications ›› 2018, Vol. 39 ›› Issue (8): 69-82. doi: 10.11959/j.issn.1000-436x.2018148

• Papers Ⅰ: Artificial Intelligence and Network Security •

DeepRD: LSTM-based Siamese network for Android repackaged application detection

Run WANG1,2, Benxiao TANG1,2, Li’na WANG1,2

  1. Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, Wuhan University, Wuhan 430072, China
  2. School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China
  • Revised: 2018-06-28 Online: 2018-08-01 Published: 2018-09-13
  • About the authors: Run WANG (1991-), male, from Anqing, Anhui, is a Ph.D. candidate at Wuhan University. His main research interests include Android security and privacy and AI security. | Benxiao TANG (1991-), male, from Huangshi, Hubei, is a Ph.D. candidate at Wuhan University. His main research interests include mobile security and privacy and system security. | Li’na WANG (1964-), female, from Yingkou, Liaoning, Ph.D., is a professor and doctoral supervisor at Wuhan University. Her main research interests include network security, information hiding, and AI security.
  • Supported by:
    The National Natural Science Foundation of China (U1536204); The Fundamental Research Funds for the Central Universities (2042018kf1028); The National High Technology Research and Development Program of China ("863" Program) (2015AA016004)

Abstract:

State-of-the-art techniques for detecting repackaged Android applications rely on expert-defined features. Defining such features is labor-intensive and time-consuming, and the resulting features are easy for attackers to guess. Moreover, existing feature representations perform poorly on the common types of repackaged applications, which leads to a high false negative rate in practical detection. To address these two issues, a deep learning-based approach to repackaged application detection is proposed that learns semantic feature representations of programs automatically. First, control-flow and data-flow analysis is performed on each application to produce a sequence feature representation. Second, a word embedding model transforms the sequence features into feature vectors, which are fed into a Siamese long short-term memory (LSTM) network for self-learning of program features. Finally, repackaged applications are detected by measuring the similarity of the learned program features. Experiments on the public dataset AndroZoo show that the proposed approach achieves a precision of 95.7% and a false negative rate below 6.2%.
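The pipeline described in the abstract (token sequences, embedding, a shared LSTM encoder, then a similarity measure) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the class `TinyLSTMEncoder`, the toy vocabulary, the layer sizes, the random untrained weights, and the similarity function exp(-L1 distance) are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMEncoder:
    """Hypothetical single-layer LSTM with an embedding table (random, untrained)."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # Embedding table: one vector per token id.
        self.embed = [[rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
                      for _ in range(vocab_size)]
        # One weight matrix and bias per gate: input, forget, output, candidate.
        in_dim = embed_dim + hidden_dim
        self.W = {g: [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
                      for _ in range(hidden_dim)] for g in "ifoc"}
        self.b = {g: [0.0] * hidden_dim for g in "ifoc"}

    def _gate(self, g, z, act):
        # Affine transform of z followed by the gate's activation.
        return [act(sum(w * x for w, x in zip(row, z)) + b)
                for row, b in zip(self.W[g], self.b[g])]

    def encode(self, token_ids):
        """Run the LSTM over the sequence and return the final hidden state."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for t in token_ids:
            z = self.embed[t] + h  # concatenate embedding and previous hidden state
            i = self._gate("i", z, sigmoid)
            f = self._gate("f", z, sigmoid)
            o = self._gate("o", z, sigmoid)
            g = self._gate("c", z, math.tanh)
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h


def siamese_similarity(encoder, seq_a, seq_b):
    """Encode both sequences with the SAME encoder (shared weights) and
    map their L1 distance into (0, 1]. The exp(-L1) form is an assumption."""
    ha = encoder.encode(seq_a)
    hb = encoder.encode(seq_b)
    l1 = sum(abs(x - y) for x, y in zip(ha, hb))
    return math.exp(-l1)


# Toy vocabulary of instruction-like tokens (hypothetical, for illustration only).
VOCAB = {"const": 0, "invoke": 1, "move": 2, "if": 3, "return": 4}


def to_ids(tokens):
    return [VOCAB[t] for t in tokens]


encoder = TinyLSTMEncoder(vocab_size=len(VOCAB), embed_dim=4, hidden_dim=8)
original = to_ids(["const", "invoke", "move", "if", "return"])
other = to_ids(["if", "if", "return"])

print(siamese_similarity(encoder, original, original))
print(siamese_similarity(encoder, original, other))
```

The defining property of a Siamese network is that both inputs pass through one encoder with shared weights, so identical sequences always map to identical vectors. In a trained system the weights would be learned from labeled pairs, and a pair would presumably be flagged as repackaged when its similarity exceeds a tuned threshold.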

Key words: repackaging, deep learning, Siamese network, LSTM, security and privacy
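The two metrics reported in the abstract, precision and false negative rate, follow the standard confusion-matrix definitions. The short sketch below shows how they are computed; the function names and the counts in the usage lines are made up for illustration and are not taken from the paper's experiments.

```python
def precision(tp, fp):
    """Fraction of pairs flagged as repackaged that are truly repackaged."""
    if tp + fp == 0:
        raise ValueError("no positive predictions")
    return tp / (tp + fp)


def false_negative_rate(tp, fn):
    """Fraction of truly repackaged pairs that the detector missed."""
    if tp + fn == 0:
        raise ValueError("no positive ground-truth samples")
    return fn / (tp + fn)


# Illustrative counts only (NOT the paper's data):
# 90 true positives, 10 false positives, 10 false negatives.
example_precision = precision(tp=90, fp=10)
example_fnr = false_negative_rate(tp=90, fn=10)
print(example_precision, example_fnr)
```

A low false negative rate matters here because each false negative is a repackaged application that passes undetected.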

