Chinese Journal of Network and Information Security ›› 2023, Vol. 9 ›› Issue (4): 104-120. doi: 10.11959/j.issn.2096-109x.2023057

• Academic Papers •

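Automated Windows domain penetration method based on reinforcement learning

Lige ZHAN1,2, Letian SHA1,2, Fu XIAO1,2, Jiankuo DONG1,2, Pinchang ZHANG1,2

  1 College of Computer, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  2 Jiangsu Provincial Key Laboratory of Wireless Sensor Network High Technology Research, Nanjing 210023, China
  • Revised: 2023-03-02 Online: 2023-08-01 Published: 2023-08-01
  • About the authors:
    Lige ZHAN (占力戈, 1996- ), male, born in Nanping, Fujian, is a master's student at Nanjing University of Posts and Telecommunications. His research interests include network attack and defense and penetration testing.
    Letian SHA (沙乐天, 1985- ), male, born in Xuzhou, Jiangsu, Ph.D., is an associate professor at Nanjing University of Posts and Telecommunications. His research interests include network security and Internet of Things attack and defense.
    Fu XIAO (肖甫, 1980- ), male, born in Shaoyang, Hunan, Ph.D., is a professor and doctoral supervisor at Nanjing University of Posts and Telecommunications. His research interests include sensor networks and the Internet of Things.
    Jiankuo DONG (董建阔, 1992- ), male, born in Xingtai, Hebei, Ph.D., is a lecturer at Nanjing University of Posts and Telecommunications. His research interests include cryptographic engineering and high-performance parallel computing.
    Pinchang ZHANG (张品昌, 1985- ), male, born in Linquan, Anhui, Ph.D., is a lecturer at Nanjing University of Posts and Telecommunications. His research interests include network security, physical-layer authentication, and satellite Internet security.
  • Supported by:
    The National Key Research and Development Program of China (2018YFB0803400); The National Science Fund for Distinguished Young Scholars of China (62125203); The National Natural Science Foundation of China (62072253)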

Abstract:

A Windows domain provides unified system services for resource sharing and information exchange among users. It simplifies intranet management but also introduces significant security risks. In recent years, attacks targeting domain controllers have proliferated. Automated penetration testing can flexibly detect vulnerabilities in a Windows domain and keep office networks running securely and stably over the long term; its core task is to efficiently discover feasible attack paths in the environment. To this end, the penetration testing process was modeled as a reinforcement learning problem, in which an agent interacted with the real domain environment to discover combinations of vulnerabilities and then verified effective attack sequences. Based on how much each host contributes to the penetration process, unnecessary states and actions were removed from the reinforcement learning model, which optimized the path selection strategy and improved practical attack efficiency. A Q-learning algorithm with state-action pruning and an optimized exploration policy was then used to select the optimal attack path, automatically verifying all possible security risks in the domain environment and giving domain administrators a basis for defense. Experiments were conducted on typical intranet business scenarios, and the optimal path was selected from the 13 efficient attack paths generated by the proposed method. Comparison with related work shows that the method improves performance in terms of domain controller privilege acquisition, host privilege acquisition, number of attack steps, convergence, and time cost.

Key words: Windows domain, penetration testing, reinforcement learning, attack path
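The idea summarized in the abstract (prune states and actions that cannot contribute, then run Q-learning with a tuned exploration policy to pick an attack path) can be illustrated with a minimal sketch. This is not the authors' implementation: the host names, the toy attack graph, the reward values, the reachability-based pruning rule, and the linearly decaying epsilon are all assumptions chosen only to make the example self-contained.

```python
import random

# Hypothetical toy domain: keys are footholds, values are hosts reachable by
# one exploit or lateral-movement action. Not taken from the paper.
GRAPH = {
    "workstation": ["file_server", "printer", "web_server"],
    "printer": [],  # dead end: contributes nothing to reaching the goal
    "file_server": ["db_server", "printer"],
    "web_server": ["file_server"],
    "db_server": ["domain_controller"],
    "domain_controller": [],
}
START, GOAL = "workstation", "domain_controller"


def prune(graph, goal):
    """Keep only hosts from which the goal is reachable (assumed pruning rule)."""
    useful = {goal}
    changed = True
    while changed:
        changed = False
        for host, targets in graph.items():
            if host not in useful and any(t in useful for t in targets):
                useful.add(host)
                changed = True
    return {h: [t for t in graph[h] if t in useful] for h in graph if h in useful}


def q_learning(graph, start, goal, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning; an action is simply the next host to compromise."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s, acts in graph.items() for a in acts}
    for ep in range(episodes):
        eps = max(0.05, 1.0 - ep / episodes)  # assumed linear epsilon decay
        state = start
        for _ in range(20):  # cap on steps per episode
            actions = graph[state]
            if state == goal or not actions:
                break
            if rng.random() < eps:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[(state, a)])  # exploit
            # Assumed rewards: large bonus for the goal, small cost per step.
            reward = 100.0 if action == goal else -1.0
            next_best = max((q[(action, a)] for a in graph[action]), default=0.0)
            q[(state, action)] += alpha * (reward + gamma * next_best - q[(state, action)])
            state = action
    return q


def greedy_path(graph, q, start, goal, max_steps=20):
    """Follow the highest-valued action from each state until the goal."""
    path = [start]
    while path[-1] != goal and len(path) <= max_steps:
        actions = graph[path[-1]]
        if not actions:
            break
        path.append(max(actions, key=lambda a: q[(path[-1], a)]))
    return path


if __name__ == "__main__":
    pruned = prune(GRAPH, GOAL)
    q_table = q_learning(pruned, START, GOAL)
    print(" -> ".join(greedy_path(pruned, q_table, START, GOAL)))
```

In this sketch, pruning removes the dead-end host before learning starts, so the agent never spends exploration steps on it, and the per-step cost steers the learned policy toward the shorter route. A real system would replace the static graph with actual interaction against the domain environment, as the abstract describes.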
