Journal on Communications ›› 2016, Vol. 37 ›› Issue (11): 104-113. doi: 10.11959/j.issn.1000-436x.2016225
Yan-chen QIAO1,2,3, Xiao-chun YUN1,2,3, Yu-peng TUO2,3, Yong-zheng ZHANG2,3
Online: 2016-11-25
Published: 2016-11-30
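Abstract:
A novel method for fast and accurate tracing of reused code is proposed. Working at function granularity and built on simhash and inverted indexing, the method quickly traces similar functions across a massive body of code. First, a code base with a three-level inverted index structure is built from a large number of samples using simhash. For a function to be traced, similar code blocks are found quickly from the simhash values of the function's code blocks; the inverted index then yields potentially similar functions, the jump relationships between code blocks are used to decide precisely whether the functions are similar, and each match is traced back to the sample that contains it. Experimental results show that, while maintaining high precision and recall, the method can use the code base to quickly identify compiler-inserted functions and reused functions in a sample.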
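The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the three index levels are taken to be fingerprint band → code block → function → sample, code blocks are represented as lists of instruction mnemonics, near-duplicate blocks are found by splitting each 64-bit fingerprint into four 16-bit bands, and the Hamming threshold (3) and match ratio (0.8) are illustrative values. All names (`CodeBase`, `add_function`, `query`, and so on) are hypothetical.

```python
import hashlib
from collections import defaultdict

BITS = 64      # fingerprint width (assumption)
MAX_DIST = 3   # illustrative Hamming threshold; four bands guarantee recall only up to 3


def simhash(tokens):
    """Simhash of one code block; tokens are assumed to be instruction mnemonics."""
    weights = [0] * BITS
    for tok in tokens:
        h = int.from_bytes(hashlib.md5(tok.encode()).digest()[:8], "big")
        for i in range(BITS):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(BITS) if weights[i] > 0)


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def bands(fp):
    """Split a 64-bit fingerprint into four 16-bit bands. Two fingerprints
    within Hamming distance 3 must agree on at least one band."""
    return [(i, (fp >> (16 * i)) & 0xFFFF) for i in range(4)]


class CodeBase:
    """Hypothetical three-level index: band -> block -> function -> sample."""

    def __init__(self):
        self.band_to_blocks = defaultdict(set)    # (band index, value) -> block fingerprints
        self.block_to_funcs = defaultdict(set)    # block fingerprint -> function keys
        self.func_to_samples = defaultdict(set)   # function key -> (sample, function name)

    def add_function(self, sample, name, blocks, edges):
        """blocks: list of token lists; edges: (src, dst) block-index pairs for jumps."""
        fps = [simhash(b) for b in blocks]
        # Identical functions in different samples share one key.
        key = (frozenset(fps), frozenset((fps[i], fps[j]) for i, j in edges))
        for fp in fps:
            for band in bands(fp):
                self.band_to_blocks[band].add(fp)
            self.block_to_funcs[fp].add(key)
        self.func_to_samples[key].add((sample, name))

    def near_blocks(self, fp):
        """Stored block fingerprints within MAX_DIST bits of fp."""
        cands = set()
        for band in bands(fp):
            cands |= self.band_to_blocks.get(band, set())
        return {c for c in cands if hamming(fp, c) <= MAX_DIST}

    def query(self, blocks, edges, min_ratio=0.8):
        """Return (sample, function name) pairs whose function resembles the query."""
        fps = [simhash(b) for b in blocks]
        near = {fp: self.near_blocks(fp) for fp in set(fps)}
        hits = defaultdict(set)                   # function key -> matched query blocks
        for fp, stored in near.items():
            for s in stored:
                for key in self.block_to_funcs[s]:
                    hits[key].add(fp)
        q_edges = {(fps[i], fps[j]) for i, j in edges}
        found = set()
        for key, matched in hits.items():
            func_fps, func_edges = key
            # Step 1: enough of the code blocks must have a near match.
            if len(matched) / max(len(near), len(func_fps)) < min_ratio:
                continue
            # Step 2: the jump relationships between matched blocks must agree.
            kept = sum(
                any((a, b) in func_edges for a in near[s] for b in near[d])
                for s, d in q_edges
            )
            denom = max(len(q_edges), len(func_edges))
            if denom == 0 or kept / denom >= min_ratio:
                found |= self.func_to_samples[key]
        return found


if __name__ == "__main__":
    demo_blocks = [["push", "mov", "sub"], ["cmp", "jz"], ["call", "ret"]]
    demo_edges = [(0, 1), (1, 2)]
    code_base = CodeBase()
    code_base.add_function("a.exe", "sub_401000", demo_blocks, demo_edges)
    print(sorted(code_base.query(demo_blocks, demo_edges)))
```

Keying each function by its block fingerprints and jump edges lets identical functions from different samples collapse into one index entry, which is one plausible way to trace a match back to every sample containing it. A production index would also need instruction normalization and persistent storage, neither of which is sketched here.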
Yan-chen QIAO, Xiao-chun YUN, Yu-peng TUO, Yong-zheng ZHANG. Fast reused code tracing method based on simhash and inverted index[J]. Journal on Communications, 2016, 37(11): 104-113.
Table 4
Tracing details for compiler functions
Compiler function | Similar functions in WinXP system files |
sub_40451A | ___old_sbh_decommit_pages in imjputyc.dll, among others |
sub_402A5B | sub_10005660 in TPPS.DLL, among others |
sub_4023E9 | sub_10006046 in TPVMW32.dll, among others |
sub_4044C4 | sub_27608F15 in MSCOMCTL.OCX, among others |
sub_4023BC | sub_10009720 in tprdpw32.dll, among others |
sub_404380 | ___old_sbh_new_region in imjpuex.exe, among others |
sub_4045DC | sub_1000A73D in tprdpw32.dll, among others |
sub_404880 | ___old_sbh_alloc_block_from_page in msvcr70.dll, among others |
sub_402531 | sub_10009895 in tprdpw32.dll, among others |
sub_404678 | ___old_sbh_alloc_block in imjpdct.dll, among others |
sub_402E2A | _calloc in imjprw.exe, among others |
sub_403BA2 | ___sbh_free_block in imjpmig.exe, among others |
sub_40292A | __heap_alloc in imjpcus.dll, among others |
sub_402799 | __NMSG_WRITE in imskdic.dll, among others |
sub_404633 | sub_1000708D in TPVMW32.dll, among others |
sub_401233 | sub_1000F109 in tprdpw32.dll, among others |
Table 5
Tracing results for the Agobot sample
Non-Agobot-family sample | Number of similar functions | Total instructions |
Net-Worm.Win32.Kolabc.eph | 44 | 2 411 |
Trojan-Spy.Win32.SCKeyLog.fp | 44 | 1 048 |
Trojan-Spy.Win32.SCKeyLog.ij | 43 | 1 141 |
Backdoor.Win32.IRCBot.ol | 40 | 4 036 |
Trojan-Dropper.Win32.Agent.tif | 31 | 712 |
Trojan-Spy.Win32.SCKeyLog.fb | 29 | 669 |
Trojan-Downloader.Win32.Delf.gij | 28 | 711 |
Backdoor.Win32.SuperSpy.b | 28 | 665 |
Trojan-Downloader.Win32.Realtens.h | 28 | 619 |
Trojan-PSW.Win32.Zombie.10 | 26 | 646 |
Table 6
Tracing results for the Zlob family
Malware family | Number of associated samples |
Trojan.Win32.Vapsup | 2 566 |
Trojan.Win32.Agent | 124 |
Trojan-Downloader.Win32.Agent | 114 |
Trojan-Clicker.Win32.Agent | 69 |
Trojan-Ransom.Win32.Hexzone | 58 |
Trojan-GameThief.Win32.OnLineGames | 47 |
Backdoor.Win32.Rbot | 45 |
Backdoor.Win32.Agent | 43 |
Trojan.Win32.BHO | 40 |
Backdoor.Win32.SdBot | 25 |