Chinese Journal of Network and Information Security, 2022, Vol. 8, Issue (6): 123-134. doi: 10.11959/j.issn.2096-109x.2022085

• Academic paper

基于符号执行和N-scope复杂度的代码混淆度量方法

肖玉强, 郭云飞, 王亚文   

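  1. Information Engineering University, Zhengzhou, Henan 450001
  • Revised: 2022-05-11 Publication date: 2022-12-15 Release date: 2023-01-16
  • About the authors: Yuqiang XIAO (1997- ), male, born in Liaoyuan, Jilin, is a master's student at Information Engineering University. His main research interests are cyberspace security, code obfuscation, and machine learning.
    Yunfei GUO (1963- ), male, born in Zhengzhou, Henan, is a professor and doctoral supervisor at Information Engineering University. His main research interests are cloud security, telecommunication network security, and network security.
    Yawen WANG (1990- ), male, born in Zhengzhou, Henan, is a lecturer at Information Engineering University. His main research interests are cloud computing, intrusion tolerance, and network security.
  • Funding:
    National Key R&D Program of China (2021YFB1006200); National Key R&D Program of China (2021YFB1006201); National Natural Science Foundation of China (62072467)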

Metrics for code obfuscation based on symbolic execution and N-scope complexity

Yuqiang XIAO, Yunfei GUO, Yawen WANG   

  1. Information Engineering University, Zhengzhou 450001, China
  • Revised: 2022-05-11 Online: 2022-12-15 Published: 2023-01-16
  • Supported by:
    The National Key R&D Program of China (2021YFB1006200); The National Key R&D Program of China (2021YFB1006201); The National Natural Science Foundation of China (62072467)

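Abstract (translated from the Chinese original):

Code obfuscation effectively counters man-at-the-end (MATE) attacks such as reverse engineering and, as an attack-mitigating endogenous security technology, is relatively mature, so sound measurement of obfuscation effect is of significant value. Research on obfuscation metrics is relatively scarce, and both methods for measuring obfuscation resilience and metrics with generalization ability and practicality are comparatively lacking. Symbolic execution is widely used in deobfuscation attacks, and the difficulty it has in generating a set of test inputs that covers all program paths can serve as a reference for measuring obfuscation resilience. However, countermeasures based on nested program structure can significantly reduce symbolic execution efficiency and increase the error of this resilience reference. To address these problems, a code obfuscation metric combining symbolic execution with N-scope complexity is proposed. The method first defines obfuscation resilience based on a program's symbolic execution time. It then proposes an N-scope complexity adapted to symbolic execution, which defines obfuscation potency and at the same time makes the symbolic-execution-based resilience measurement more robust for programs with multi-level nested structure. Finally, it proposes a correlation analysis of obfuscation effect that combines dynamic and static analysis, quantifying the effect through symbolic execution and control flow graph extraction. An implementation framework of the metric for C programs was built and validated. In the experiments, obfuscation effect was measured on about 4,000 programs from three public program sets and their obfuscated counterparts. The results show that the proposed metric characterizes obfuscation effect well while offering some generalization ability and practical value. A usage example in a simulated real-world obfuscation scenario is also given, providing users of obfuscation techniques with an effective reference for measuring and configuring them.

Keywords (translated): code obfuscation, obfuscation metrics, symbolic execution, N-scope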

Abstract:

Code obfuscation is a well-developed, attack-mitigating endogenous security technology that effectively resists man-at-the-end (MATE) attacks such as reverse engineering, so a sound measure of obfuscation effect is valuable. Because symbolic execution is widely used in deobfuscation attacks, the effort needed to generate a set of test inputs that executes all program paths can serve as a reference for obfuscation resilience. However, adversarial techniques that exploit nested program structure can significantly reduce symbolic execution efficiency and increase the error of this resilience reference. To solve these problems, a metric for code obfuscation based on symbolic execution and N-scope complexity was proposed. Obfuscation resilience was defined from symbolic execution time, and obfuscation potency was defined from the proposed N-scope complexity, which makes the resilience measurement more robust for programs with multi-level nested structure. Furthermore, a correlation analysis of obfuscation effect was proposed, and the effect was quantified through symbolic execution and control flow graph extraction. In the experiment, about 4,000 programs from three public program sets and their obfuscated versions were evaluated with the proposed metric, and the results indicated its generalization ability and practicality. An example application of the metric in a simulated obfuscation scenario was also presented, providing users of obfuscation techniques with a reference for measuring and configuring them.

Key words: code obfuscation, obfuscation metrics, symbolic execution, N-scope
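
The correlation step described in the abstract, relating a static structural score to a dynamic symbolic execution time, could be sketched roughly as follows. This is only an illustration under stated assumptions: `nesting_score` is a hypothetical brace-depth proxy and is not the paper's N-scope complexity (whose definition the abstract does not give), and the execution times are assumed to have been measured separately with a symbolic executor of the reader's choice.

```python
import math


def nesting_depths(c_source):
    """Return the brace depth reached at each '{' in a C source string.

    Simplification (assumption): braces inside string literals or
    comments are not skipped, so this is only a rough structural proxy.
    """
    depths = []
    depth = 0
    for ch in c_source:
        if ch == "{":
            depth += 1
            depths.append(depth)
        elif ch == "}":
            depth = max(depth - 1, 0)
    return depths


def nesting_score(c_source):
    """Hypothetical potency proxy: sum of depths over all opened blocks.

    Deeper nesting contributes more than the same number of flat blocks.
    This is NOT the N-scope complexity defined in the paper.
    """
    return sum(nesting_depths(c_source))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

Given one static score and one measured symbolic execution time per program, `pearson(scores, times)` returns a value in [-1, 1]; a value near 1 would suggest that the static proxy tracks the dynamic effort on that program set.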
