网络与信息安全学报 (Chinese Journal of Network and Information Security), 2023, Vol. 9, Issue 6: 166-175. doi: 10.11959/j.issn.2096-109x.2023091

• Academic Paper •

基于攻击图和深度Q学习网络的自动化安全分析与渗透测试模型

樊成, 胡国庆, 丁涛杰, 张展华   

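  1. The 31681 Unit of PLA, Tianshui 741000, Gansu, China
  • Revised: 2023-10-28 Online: 2023-12-01 Published: 2023-12-01
  • About the authors: Cheng FAN (1993- ), male, born in Qingyang, Gansu, is a research intern at the 31681 Unit of PLA. His main research interests are network information system security, data mining, and artificial intelligence security.
    Guoqing HU (1982- ), male, born in Luzhou, Sichuan, is an assistant researcher at the 31681 Unit of PLA. His main research interests are electronic information and network security.
    Taojie DING (1990- ), male, born in Zhuji, Zhejiang, is a research intern at the 31681 Unit of PLA. His main research interests are pervasive computing and network security.
    Zhanhua ZHANG (2000- ), male, born in Cixian, Hebei, is a research intern at the 31681 Unit of PLA. His main research interests are cyber-electromagnetic countermeasures and artificial intelligence security.
  • Supported by:
    The National Natural Science Foundation of China (61902426)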

Autonomous security analysis and penetration testing model based on attack graph and deep Q-learning network

Cheng FAN, Guoqing HU, Taojie DING, Zhanhua ZHANG   

  1. The 31681 Unit of PLA, Tianshui 741000, China
  • Revised: 2023-10-28 Online: 2023-12-01 Published: 2023-12-01
  • Supported by:
    The National Natural Science Foundation of China (61902426)

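Abstract (translated from the Chinese original):

With the rapid development and wide application of network technology, network security problems have become increasingly prominent, and penetration testing has become an important means of assessing and improving network security. However, traditional manual penetration testing is inefficient and is easily affected by human error and by the skill level of the tester, which leads to highly uncertain test results and unsatisfactory assessments. To address these problems, an autonomous security analysis and penetration testing (ASAPT) model based on attack graphs and the deep Q-learning network (DQN) is proposed. The model consists of two parts: training data construction and model training. In the training data construction stage, an attack graph is used for threat modeling of the target network: the vulnerabilities present in the network and the attack paths available to an attacker are converted into nodes and edges. A corresponding "state-action" transition matrix is then built with the common vulnerability scoring system (CVSS) vulnerability database; it describes the attacker's actions and transition probabilities in different states and reflects both the attacker's capability and the security posture of the network. To further reduce computational complexity, a depth-first search algorithm is applied, as a novel step, to simplify the transition matrix by finding and retaining all attack paths that can reach the final target, which facilitates subsequent model training. In the model training stage, a DQN-based deep reinforcement learning algorithm determines the optimal attack path in penetration testing; by repeatedly interacting with the environment and updating the Q-value function, it progressively improves its choice of attack path. Simulation results show that the ASAPT model reaches an accuracy of up to 84% in finding the optimal path, converges quickly, and adapts better than traditional Q-learning to large-scale network environments, so it can provide guidance for practical penetration testing.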
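The simplification step described in the abstract (keeping only the attack paths that reach the final target) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the graph, the node names, the probabilities, and the helper names `goal_paths` and `prune` are hypothetical placeholders chosen for illustration, and the probabilities are not derived from CVSS data.

```python
from typing import Dict, List

# Hypothetical toy attack graph: node -> {successor: exploit success probability}.
# The values are made-up placeholders, not CVSS-derived numbers from the paper.
TRANSITIONS: Dict[str, Dict[str, float]] = {
    "start": {"web": 0.8, "mail": 0.6},
    "web": {"db": 0.5, "dead_end": 0.9},
    "mail": {"db": 0.4},
    "dead_end": {},
    "db": {},
}


def goal_paths(graph: Dict[str, Dict[str, float]], source: str, goal: str) -> List[List[str]]:
    """Enumerate every simple path from source to goal with depth-first search."""
    paths: List[List[str]] = []

    def dfs(node: str, path: List[str]) -> None:
        if node == goal:
            paths.append(list(path))
            return
        for nxt in graph.get(node, {}):
            if nxt not in path:  # skip nodes already on the path to avoid cycles
                path.append(nxt)
                dfs(nxt, path)
                path.pop()

    dfs(source, [source])
    return paths


def prune(graph: Dict[str, Dict[str, float]], paths: List[List[str]]) -> Dict[str, Dict[str, float]]:
    """Keep only the transitions that lie on at least one goal-reaching path."""
    kept: Dict[str, Dict[str, float]] = {}
    for path in paths:
        for a, b in zip(path, path[1:]):
            kept.setdefault(a, {})[b] = graph[a][b]
    return kept


paths = goal_paths(TRANSITIONS, "start", "db")
pruned = prune(TRANSITIONS, paths)
```

In this toy graph the transition from `web` to `dead_end` is dropped because no path through it reaches `db`, so a learner trained on `pruned` has fewer state-action pairs to explore.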

Keywords: automated penetration testing, reinforcement learning, attack graph, deep Q-learning network

Abstract:

With the continuous development and widespread application of network technology, network security issues have become increasingly prominent. Penetration testing has emerged as an important method for assessing and enhancing network security. However, traditional manual penetration testing is inefficient, prone to human error, and dependent on tester skill, leading to high uncertainty and poor evaluation results. To address these challenges, an autonomous security analysis and penetration testing framework called ASAPT was proposed, based on attack graphs and deep Q-learning networks (DQN). The ASAPT framework consisted of two main components: training data construction and model training. In the training data construction phase, attack graphs were used to model the threats in the target network by representing vulnerabilities and possible attack paths as nodes and edges. By integrating the common vulnerability scoring system (CVSS) vulnerability database, a "state-action" transition matrix was constructed, which described the attacker's behavior and transition probabilities in different states. This matrix captured both the attacker's capabilities and the security status of the network. To reduce computational complexity, a depth-first search (DFS) algorithm was applied to simplify the transition matrix, identifying and preserving all attack paths that lead to the final goal for subsequent model training. In the model training phase, a deep reinforcement learning algorithm based on DQN was employed to determine the optimal attack path during penetration testing. The algorithm interacted continuously with the environment, updating the Q-value function to progressively optimize the selection of attack paths. Simulation results show that ASAPT achieves an accuracy of up to 84% in identifying the optimal path and converges quickly. Compared with traditional Q-learning, ASAPT adapts better to large-scale network environments and could provide guidance for practical penetration testing.
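To make the "interact with the environment and update the Q-value function" loop concrete, here is a minimal Python sketch. It deliberately uses tabular Q-learning (the traditional baseline the abstract compares against) rather than a DQN, which would replace the table with a neural network. The graph, step costs, goal reward, hyperparameters, and the function names `train` and `greedy_path` are all hypothetical and are not taken from the paper.

```python
import random
from typing import Dict, List

# Hypothetical pruned attack graph: state -> {next_state: step cost}.
# Costs and the goal reward are illustrative assumptions only.
GRAPH: Dict[str, Dict[str, float]] = {
    "start": {"mail": 3.0, "web": 1.0},
    "web": {"db": 1.0},
    "mail": {"db": 1.0},
    "db": {},
}
GOAL = "db"
GOAL_REWARD = 10.0


def train(graph, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration on a small attack graph."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in nxt} for s, nxt in graph.items()}
    for _ in range(episodes):
        state = "start"
        while state != GOAL:
            actions = list(graph[state])
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[state][a])  # exploit
            # Reward: negative step cost, plus a bonus when the goal is reached.
            reward = -graph[state][action] + (GOAL_REWARD if action == GOAL else 0.0)
            # Best value obtainable from the next state (0 for the terminal state).
            future = max(q[action].values(), default=0.0)
            # Standard Q-learning update toward the bootstrapped target.
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = action
    return q


def greedy_path(q, source: str = "start") -> List[str]:
    """Follow the highest-valued action from each state until a terminal state."""
    path = [source]
    while q[path[-1]]:
        path.append(max(q[path[-1]], key=q[path[-1]].get))
    return path


q_table = train(GRAPH)
best_path = greedy_path(q_table)
```

With these assumed costs the cheaper route through `web` ends up with the higher Q-value at `start`, so the greedy path runs `start`, `web`, `db`. A table like this grows with the number of state-action pairs, which is the usual motivation for approximating Q with a network on larger graphs.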

Key words: autonomous penetration testing, reinforcement learning, attack graph, deep Q-learning network
