Chinese Journal of Network and Information Security, 2023, Vol. 9, Issue (4): 104-120. doi: 10.11959/j.issn.2096-109x.2023057
Lige ZHAN, Letian SHA, Fu XIAO, Jiankuo DONG, Pinchang ZHANG
Revised: 2023-03-02
Online: 2023-08-01
Published: 2023-08-01
Lige ZHAN, Letian SHA, Fu XIAO, Jiankuo DONG, Pinchang ZHANG. Automated Windows domain penetration method based on reinforcement learning[J]. Chinese Journal of Network and Information Security, 2023, 9(4): 104-120.
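Host: abbreviation | No. | Operating system | Vulnerability: type | Domain user on host | Privileges (hosts accessible) |
Web server: Web | 1 | Windows 2012 | CVE_2019_2725: privilege acquisition | User2 | Web, PC5, PC6 |
Zone management host: PC1 | 2 | Windows 7 | CVE_2019_0708: privilege acquisition | User1 | DB, PC1 |
Operations and maintenance host: PC2 | 3 | Windows 10 | CVE_2020_0796: privilege acquisition | User3 | PC2 |
Logistics support host: PC3 | 4 | Windows 7 | CVE_2017_0143: privilege acquisition | User4 | PC3 |
Business host 1: PC4 | 5 | Windows 7 | CVE_2018_8120: privilege escalation | User5 | PC4 |
Business host 2: PC5 | 6 | Windows 7 | CVE_2017_0143: privilege acquisition | User2 | Web, PC5, PC6 |
Business host 3: PC6 | 7 | Windows XP | CVE_2008_4250: privilege acquisition | User2 | Web, PC5, PC6 |
Database server: DB | 8 | Windows 2008 | CVE_2017_7269: privilege acquisition; CVE_2016_3225: privilege escalation | User1; Admin | DB, PC1; ALL |
Mail server: MAIL | 9 | Windows 2008 | CVE_2020_0787: privilege escalation | User6 | |
Domain controller: DC | 10 | Windows 2008 | None | Admin | ALL |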
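The host table pairs each host with its domain users and lists the hosts that each user's credentials unlock. The following minimal Python sketch shows how that information could be chained to see where credential reuse alone leads. It rests on one simplifying assumption that is not stated in the table: compromising a host yields the credentials of every domain user on it, with no separate privilege-escalation step. The names `USERS_ON_HOST`, `ACCESS`, and `reachable_by_credentials` are hypothetical and used only for illustration.

```python
from collections import deque

ALL_HOSTS = ["Web", "PC1", "PC2", "PC3", "PC4", "PC5", "PC6", "DB", "MAIL", "DC"]

# Domain users present on each host, transcribed from the host table.
USERS_ON_HOST = {
    "Web": ["User2"], "PC1": ["User1"], "PC2": ["User3"], "PC3": ["User4"],
    "PC4": ["User5"], "PC5": ["User2"], "PC6": ["User2"],
    "DB": ["User1", "Admin"], "MAIL": ["User6"], "DC": ["Admin"],
}

# Hosts each user's credentials unlock. "ALL" is expanded to every host;
# the table lists no privileges for User6, so that entry is empty.
ACCESS = {
    "User1": ["DB", "PC1"], "User2": ["Web", "PC5", "PC6"], "User3": ["PC2"],
    "User4": ["PC3"], "User5": ["PC4"], "User6": [], "Admin": list(ALL_HOSTS),
}


def reachable_by_credentials(start):
    """Breadth-first search over hosts reachable by credential reuse alone.

    Assumption (illustrative only): owning a host exposes the credentials
    of all domain users on that host.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        host = queue.popleft()
        for user in USERS_ON_HOST[host]:
            for target in ACCESS[user]:
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
    return seen


if __name__ == "__main__":
    for host in ("Web", "PC1", "PC2"):
        print(host, sorted(reachable_by_credentials(host)))
```

Under this assumption, starting from Web reaches only Web, PC5, and PC6 through User2, whereas starting from PC1 reaches DB through User1 and then every host through Admin.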

No.: corresponding state |
1: 1L |
2: 1L-2L |
3: 1L-2L-5U |
4: 1L-2L-3L-4L-5U |
5: 1L-2L-3L-5U |
6: 1L-2L-5L |
7: 1L-2L-3L-4L-5L |
8: 1L-2L-3L-5L |
9: 1L-2L-4L-5U |
10: 1L-2L-4L-5L |
11: 1L-2L-3L |
12: 1L-2L-3L-4L |
13: 1L-2L-4L |
14: 1L-3L |
15: 1L-3L-5L |
16: 1L-3L-4L |
17: 1L-3L-4L-5L |
18: 1L-4L |
19: G |

Action No.: valid action (target) |
a1: vulnerability exploitation, CVE_2019_0708 (PC1) |
a2: vulnerability exploitation, CVE_2020_0796 (PC2) |
a3: vulnerability exploitation, CVE_2017_0143 (PC3) |
a4: vulnerability exploitation, CVE_2017_7269 (DB) |
a5: credential use, User1 (DB) |
a6: vulnerability exploitation, CVE_2016_3225 (DB) |
a7: credential use, Admin (DC) |

No. | Attack path | Cumulative reward |
1 | 1→14→15→G1 | 978.1 |
2 | 1→14→16→17→G2 | 82.4 |
3 | 1→2→11→12→5→7→G3 | 1 874.5 |
4 | 1→2→11→12→7→G4 | 1 848.6 |
5 | 1→2→11→8→G5 | 1 097.7 |
6 | 1→2→3→4→5→7→G6 | 2 934.5 |
7 | 1→2→3→4→8→G7 | 2 764.5 |
8 | 1→2→3→6→G8 | 3 804.7 |
9 | 1→2→3→9→5→7→G9 | 3 075.2 |
10 | 1→2→3→9→10→G10 | 3 552.5 |
11 | 1→2→13→12→5→7→G11 | 1 874.1 |
12 | 1→2→13→12→7→G12 | 1 848.2 |
13 | 1→18→16→17→G13 | 16.1 |
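To compare the listed attack paths, the table can be loaded into a small Python structure and ranked by cumulative reward. This is only an illustrative sketch: the names `ATTACK_PATHS` and `rank_paths` are hypothetical, the state numbers are copied from the table, and the trailing `G` label of each path is read as the goal state and omitted from the state lists.

```python
# (path number, states visited before the goal G, cumulative reward),
# transcribed from the attack path table.
ATTACK_PATHS = [
    (1, [1, 14, 15], 978.1),
    (2, [1, 14, 16, 17], 82.4),
    (3, [1, 2, 11, 12, 5, 7], 1874.5),
    (4, [1, 2, 11, 12, 7], 1848.6),
    (5, [1, 2, 11, 8], 1097.7),
    (6, [1, 2, 3, 4, 5, 7], 2934.5),
    (7, [1, 2, 3, 4, 8], 2764.5),
    (8, [1, 2, 3, 6], 3804.7),
    (9, [1, 2, 3, 9, 5, 7], 3075.2),
    (10, [1, 2, 3, 9, 10], 3552.5),
    (11, [1, 2, 13, 12, 5, 7], 1874.1),
    (12, [1, 2, 13, 12, 7], 1848.2),
    (13, [1, 18, 16, 17], 16.1),
]


def rank_paths(paths):
    """Return the paths sorted by cumulative reward, highest first."""
    return sorted(paths, key=lambda p: p[2], reverse=True)


if __name__ == "__main__":
    for number, states, reward in rank_paths(ATTACK_PATHS)[:3]:
        route = "→".join(str(s) for s in states) + "→G"
        print(f"path {number}: {route} (reward {reward})")
```

Ranked this way, path 8 (1→2→3→6→G) has the highest cumulative reward at 3804.7, followed by paths 10 and 9, while path 13 has the lowest at 16.1.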