基于Markov时间博弈的移动目标防御最优策略选取方法

通信学报 (Journal on Communications), 2020, Vol. 41, Issue 1: 42-52. doi: 10.11959/j.issn.1000-436x.2020003


谭晶磊¹,²，张恒巍¹，张红旗¹,²，金辉¹,²，雷程¹,²

¹ 信息工程大学三院，河南郑州 450001
² 河南省信息安全重点实验室，河南郑州 450001

修回日期:2019-09-21 出版日期:2020-01-25 发布日期:2020-02-11
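About the authors: TAN Jinglei (1994- ), male, born in Zhangqiu, Shandong, doctoral student at Information Engineering University; research interests: network information security, moving target defense, and attack-defense game confrontation | ZHANG Hengwei (1978- ), male, born in Luoyang, Henan, Ph.D., associate professor at Information Engineering University; research interests: network security and attack-defense confrontation, and information security risk assessment | ZHANG Hongqi (1962- ), male, born in Zunhua, Hebei, Ph.D., professor and doctoral supervisor at Information Engineering University; research interests: network security, moving target defense, classified protection, and information security management | JIN Hui (1988- ), male, born in Beijing, master's student at Information Engineering University; research interests: network information security | LEI Cheng (1989- ), male, born in Beijing, doctoral student at Information Engineering University; research interests: network information security, moving target defense, secure data exchange, and network flow fingerprinting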
基金资助:
国家重点研发计划基金资助项目(2016YFF0204002);国家重点研发计划基金资助项目(2016YFF0204003);国家自然科学基金资助项目(61902427)

Optimal strategy selection approach of moving target defense based on Markov time game

Jinglei TAN¹,², Hengwei ZHANG¹, Hongqi ZHANG¹,², Hui JIN¹,², Cheng LEI¹,²

¹ Department of Three, Information Engineering University, Zhengzhou 450001, China
² Henan Key Laboratory of Information Security, Zhengzhou 450001, China

Revised:2019-09-21 Online:2020-01-25 Published:2020-02-11
Supported by:
The National Key Research and Development Program of China(2016YFF0204002);The National Key Research and Development Program of China(2016YFF0204003);The National Natural Science Foundation of China(61902427)

摘要/Abstract

摘要：

针对现有博弈模型难以有效建模网络攻防对抗动态连续特性的问题，提出了一种基于 Markov 时间博弈的移动目标防御最优策略选取方法。在分析移动目标攻防对抗过程的基础上，构建了移动目标攻防策略集合，利用时间博弈刻画了单阶段移动目标防御过程的动态性，利用 Markov 决策过程描述了多阶段移动目标防御状态转化的随机性。同时，将攻防双方对资源脆弱性的利用抽象为对攻击面控制权的交替，从而有效保证了博弈模型的通用性。在此基础上，分析并证明了均衡的存在性，设计了最优策略选取算法。最后，通过应用实例验证了所提模型的实用性和算法的有效性。

关键词: 时间博弈, 移动目标攻击, 移动目标防御, 最优策略选取, Markov决策

Abstract:

To address the difficulty that existing game models have in capturing the dynamic and continuous nature of network attack-defense confrontation, an optimal strategy selection method for moving target defense based on a Markov time game is proposed. Based on an analysis of the moving target attack-defense confrontation process, the sets of moving target attack and defense strategies are constructed. A time game is used to characterize the dynamics of the single-stage moving target defense process, and a Markov decision process is used to describe the randomness of state transitions in multi-stage moving target defense. In addition, the exploitation of resource vulnerabilities by the attacker and the defender is abstracted as the alternation of control over the attack surface, which keeps the game model general. On this basis, the existence of an equilibrium is analyzed and proved, and an optimal strategy selection algorithm is designed. Finally, an application example verifies the practicality of the proposed model and the effectiveness of the algorithm.
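The idea of treating attack and defense as an alternation of control over the attack surface can be illustrated with a small sketch in the spirit of the FlipIt game cited in reference [14]. This is not the paper's model: the function name `control_fractions`, the rule that the defender holds control at time 0, and the tie-breaking rule are all assumptions made for illustration.

```python
def control_fractions(defender_moves, attacker_moves, horizon):
    """Share of [0, horizon) during which each side controls the attack surface.

    Assumptions (illustrative, not taken from the paper): the defender holds
    control at time 0, a move takes control instantly, and the defender wins
    ties when both sides move at the same instant.
    """
    events = [(t, 0) for t in defender_moves if 0 <= t < horizon]
    events += [(t, 1) for t in attacker_moves if 0 <= t < horizon]
    # At equal times handle the attacker (1) first so the defender (0) ends up in control.
    events.sort(key=lambda e: (e[0], -e[1]))

    held = [0.0, 0.0]          # accumulated control time: [defender, attacker]
    owner, last = 0, 0.0
    for t, who in events:
        held[owner] += t - last
        owner, last = who, t
    held[owner] += horizon - last
    return held[0] / horizon, held[1] / horizon


if __name__ == "__main__":
    d, a = control_fractions([0, 4, 8], [1, 5, 9], horizon=12)
    print(f"defender {d:.2f}, attacker {a:.2f}")  # defender 0.25, attacker 0.75
```

In this toy run the attacker always moves one time unit after the defender, so the defender keeps control for only a quarter of the horizon. It shows why the timing of moves, and not only the choice of strategy, affects the payoff in a time game.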

Key words: time game, moving target attack, moving target defense, optimal strategy selection, Markov decision

中图分类号 (CLC number): TN918.1

谭晶磊,张恒巍,张红旗,金辉,雷程. 基于Markov时间博弈的移动目标防御最优策略选取方法[J]. 通信学报, 2020, 41(1): 42-52.

Jinglei TAN,Hengwei ZHANG,Hongqi ZHANG,Hui JIN,Cheng LEI. Optimal strategy selection approach of moving target defense based on Markov time game[J]. Journal on Communications, 2020, 41(1): 42-52.

Figures/Tables (12)

Table 5. Moving target attack and defense strategies in different network states

Network state	Attack and defense strategies
S₁	$P_{MTA}^{1} = \{P_{{MTA}_{1}}, P_{{MTA}_{2}}, P_{{MTA}_{3}}\}$
	$P_{MTD}^{1} = \{\text{IDS}, P_{{MTD}_{1}}, P_{{MTD}_{3}}\}$
S₂	$P_{MTA}^{2} = \{P_{{MTA}_{2}}, P_{{MTA}_{4}}, P_{{MTA}_{5}}\}$
	$P_{MTD}^{2} = \{\text{patch upgrade}, P_{{MTD}_{3}}, P_{{MTD}_{1}} + P_{{MTD}_{3}}\}$
S₃	$P_{MTA}^{3} = \{P_{{MTA}_{1}}, P_{{MTA}_{6}}, P_{{MTA}_{7}}\}$
	$P_{MTD}^{3} = \{P_{{MTD}_{1}}, P_{{MTD}_{3}}, P_{{MTD}_{2}} + P_{{MTD}_{3}}\}$
S₄	$P_{MTA}^{4} = \{P_{{MTA}_{3}}, P_{{MTA}_{7}}, P_{{MTA}_{8}}\}$
	$P_{MTD}^{4} = \{P_{{MTD}_{2}}, \text{IDS}, \text{close service}\}$

Table 6. State transition probabilities of the network system

Network state	Transition probability
S₁	$f_{12} = 0.25\ (P_{{MTA}_{1}}, P_{{MTD}_{1}})$, $f_{13} = 0.36\ (P_{{MTA}_{2}}, P_{{MTD}_{1}})$, $f_{14} = 0.85\ (P_{{MTA}_{3}}, P_{{MTD}_{1}})$
S₂	$f_{21} = 0.9\ (P_{{MTA}_{1}}, P_{{MTD}_{2}})$, $f_{23} = 0.16\ (P_{{MTA}_{2}}, P_{{MTD}_{2}})$, $f_{24} = 0.38\ (P_{{MTA}_{3}}, P_{{MTD}_{2}})$
S₃	$f_{32} = 0.9\ (P_{{MTA}_{1}}, P_{{MTD}_{3}})$
S₄	$f_{43} = 0.9\ (P_{{MTA}_{1}}, P_{{MTD}_{4}})$, $f_{42} = 0.8\ (P_{{MTA}_{2}}, P_{{MTD}_{4}})$
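The transition table can be encoded as a lookup keyed by the current state and the attack/defense strategy indices. The sketch below is only one possible reading of the table: the name `next_state_distribution` is hypothetical, and the rule that any probability mass not listed in the table stays in the current state is an assumption that the source does not state.

```python
# Entries copied from the transition table:
# (state, attack index, defense index) -> (next state, probability)
TRANSITIONS = {
    ("S1", 1, 1): ("S2", 0.25),
    ("S1", 2, 1): ("S3", 0.36),
    ("S1", 3, 1): ("S4", 0.85),
    ("S2", 1, 2): ("S1", 0.9),
    ("S2", 2, 2): ("S3", 0.16),
    ("S2", 3, 2): ("S4", 0.38),
    ("S3", 1, 3): ("S2", 0.9),
    ("S4", 1, 4): ("S3", 0.9),
    ("S4", 2, 4): ("S2", 0.8),
}


def next_state_distribution(state, attack, defense):
    """Distribution over next states for one strategy pair.

    Assumption (not from the source): the remaining probability mass
    keeps the system in its current state.
    """
    key = (state, attack, defense)
    if key not in TRANSITIONS:
        return {state: 1.0}
    target, p = TRANSITIONS[key]
    return {target: p, state: 1.0 - p}


if __name__ == "__main__":
    print(next_state_distribution("S1", 1, 1))
```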

Table 7. Payoff matrices of moving target attack and defense strategies

Network state	MTA payoff	MTD payoff
S₁	$\begin{bmatrix} 21 & 42 & 22 \\ 19 & 15 & 18 \\ 0 & 0 & 0 \end{bmatrix}$	$\begin{bmatrix} -21 & -44 & -22 \\ -19 & -15 & -18 \\ -36 & -11 & -6 \end{bmatrix}$
S₂	$\begin{bmatrix} 53 & 62 & 34 \\ 16 & 27 & 21 \\ 15 & 17 & 9 \end{bmatrix}$	$\begin{bmatrix} -53 & -62 & -34 \\ -16 & -27 & -21 \\ -16 & -17 & -9 \end{bmatrix}$
S₃	$\begin{bmatrix} 36 & 21 & 18 \\ 23 & 26 & 21 \\ 12 & 7 & 13 \end{bmatrix}$	$\begin{bmatrix} -36 & -21 & -18 \\ -23 & -26 & -21 \\ -12 & -7 & -13 \end{bmatrix}$
S₄	$\begin{bmatrix} 35 & 18 & 0 \\ 27 & 13 & 0 \\ 6 & 20 & 0 \end{bmatrix}$	$\begin{bmatrix} -35 & -18 & 0 \\ -27 & -13 & 0 \\ -6 & -20 & 0 \end{bmatrix}$
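For a single zero-sum payoff matrix, a first check is whether a pure-strategy saddle point exists. The sketch below applies the standard maximin/minimax test to the S₃ matrix. It assumes that rows index MTA strategies and columns index MTD strategies, which the table does not state, and it ignores the time factor and the multi-stage payoffs that the paper's method includes, so its result is not expected to match the paper's equilibrium.

```python
def saddle_point(payoff):
    """Return (row, col, value) of a pure-strategy saddle point, or None.

    payoff[i][j] is the row player's (maximizer's) payoff in a zero-sum
    game; the column player minimizes the same quantity.
    """
    row_mins = [min(row) for row in payoff]
    col_maxs = [max(col) for col in zip(*payoff)]
    maximin, minimax = max(row_mins), min(col_maxs)
    if maximin != minimax:
        return None  # no pure equilibrium; mixed strategies are needed
    return row_mins.index(maximin), col_maxs.index(minimax), maximin


# MTA payoff matrix for S3, copied from the table.
# Assumption: rows are MTA strategies, columns are MTD strategies.
S3_MTA = [
    [36, 21, 18],
    [23, 26, 21],
    [12, 7, 13],
]

if __name__ == "__main__":
    print(saddle_point(S3_MTA))  # (1, 2, 21), using zero-based indices
```

Under the stated orientation, the second attack strategy and the third defense strategy form a pure saddle point with value 21 in the single-stage S₃ matrix. If the rows and columns were swapped, the result would differ, so the orientation should be confirmed against the full paper.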


References (23)

[1]	MITROPOULOS D, LOURIDAS P, POLYCHRONAKIS M, et al. Defending against web application attacks: approaches, challenges and implications[J]. IEEE Transactions on Dependable and Secure Computing, 2017: 1.
[2]	ZHENG J, NAMIN A S. A survey on the moving target defense strategies: an architectural perspective[J]. Journal of Computer Science and Technology, 2019, 34(1): 207-233.
[3]	CAI G L, WANG B S, XING Q Q. Game theoretic analysis for the mechanism of moving target defense[J]. Frontiers of Information Technology & Electronic Engineering, 2017, 18(12): 2017-2034.
[4]	姜伟, 方滨兴, 田志宏. 基于攻防博弈模型的网络安全测评和最优主动防御[J]. 计算机学报, 2013, 32(4): 818-827.
	JIANG W, FANG B X, TIAN Z H. Defense strategies selection based on attack-defense game model[J]. Chinese Journal of Computers, 2013, 47(12): 818-827.
[5]	林旺群, 王慧, 刘家红. 基于非合作动态博弈的网络安全主动防御技术研究[J]. 计算机研究与发展, 2013, 48(2): 306-316.
	LIN W Q, WANG H, LIU J H. Research on active defense technology in network security based on non-cooperative dynamic game theory[J]. Journal of Computer Research and Development, 2013, 48(2): 306-316.
[6]	MANADHATA P K. Game theoretic approaches to attack surface shifting[M]. New York: Springer Press, 2013: 1-13.
[7]	VADLAMUDI S G, SENGUPTA S, TAGUINOD M, et al. Moving target defense for web applications using Bayesian Stackelberg games[C]// The 2016 International Conference on Autonomous Agents & Multiagent Systems. International Foundation for Autonomous Agents and Multiagent Systems, 2016: 1377-1378.
[8]	LEI C, ZHANG H Q, WAN L M, et al. Incomplete information Markov game theoretic approach to strategy generation for moving target defense[J]. Computer Communications, 2018, 116: 184-199.
[9]	MALEKI H, VALIZADEH M H, KOCH W, et al. Markov modeling of moving target defense games[J]. Journal of Cryptology, 2016: 47-83.
[10]	JAJODIA S, GHOSH A K, SWARUP V, et al. Moving target defense: creating asymmetric uncertainty for cyber threats[J]. Springer Ebooks, 2011, 54.
[11]	LEI C, ZHANG H Q, WANG L M, et al. Incomplete information Markov game theoretic approach to strategy generation for moving target defense[J]. Computer Communications, 2018, 116: 184-199.
[12]	ZHENG J J, NAMIN A S. A survey on the moving target defense strategies: an architectural perspective[J]. Journal of Computer Science and Technology, 2019, 34(1): 207-233.
[13]	谭晶磊, 张红旗, 雷程, 等. 面向SDN的移动目标防御技术研究进展[J]. 网络与信息安全学报, 2018, 4(7): 1-12.
	TAN J L, ZHANG H Q, LEI C, et al. Research progress on moving target defense for SDN[J]. Chinese Journal of Network and Information Security, 2018, 4(7): 1-12.
[14]	VAN DIJK M, JUELS A, OPREA A, et al. FlipIt: the game of "stealthy takeover"[J]. Journal of Cryptology, 2013, 26(4): 655-713.
[15]	ZHENG J, SIAMI NAMIN A. A Markov decision process to determine optimal policies in moving target[C]// The 2018 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2018: 2321-2323.
[16]	刘江, 张红旗, 刘艺. 基于不完全信息动态博弈的动态目标防御最优策略选取研究[J]. 电子学报, 2018, 46(1): 82-89.
	LIU J, ZHANG H Q, LIU Y. Research on optimal selection of moving target defense policy based on dynamic game with incomplete information[J]. Acta Electronica Sinica, 2018, 46(1): 82-89.
[17]	LEI C, MA D H, ZHANG H Q. Optimal strategy selection for moving target defense based on Markov game[J]. IEEE Access, 2017, PP(99): 1.
[18]	BORKOVSKY R N, DORASZELSKI U, KRYUKOV Y. A user's guide to solving dynamic stochastic games using the homotopy method[J]. Operations Research, 2015, 58(4): 1116-1132.
[19]	CHEN M, SAAD W, YIN C. Virtual reality over wireless networks: quality-of-service model and learning-based resource management[J]. IEEE Transactions on Communications, 2018, 66(11): 5621-5635.
[20]	NILIM A, GHAOUI L E. Robust control of Markov decision processes with uncertain transition matrices[J]. Operations Research, 2016, 53(5): 780-798.
[21]	SULEIMAN R. On gamesmen and fair men: explaining fairness in non-cooperative bargaining games[J]. Royal Society Open Science, 2018, 5(2): 171709.
[22]	MANADHATA P K. Game theoretic approaches to attack surface shifting[M]. New York: Springer Press, 2013: 1-13.
[23]	CLARK A, SUN K, BUSHNELL L, et al. A game-theoretic approach to IP address randomization in decoy-based cyber defense[C]// International Conference on Decision and Game Theory for Security. Springer, 2015: 3-21.

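Moving target attack strategy	Specific method
Polymorphic MTA	Changes the malware signature
Self-modifying MTA	Dynamically changes the malware code
Obfuscated MTA	Hides malicious activity
Self-encrypting MTA	Changes the malware signature and hides malicious code and data
Anti-VM/anti-sandbox MTA	Changes behavior in a tracing environment to evade automated forensic analysis
Anti-debugging MTA	Changes behavior in a tracing environment to evade automated/manual investigation
Targeted-exploit MTA	Changes parameters and signatures to evade automated/manual investigation
Behavior-changing MTA	Waits for real user activity before executing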

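No.	Category	Name	Specific method
1	System-layer MTD	Software MTD	Changes applications, operating systems, and data
		Hardware MTD	Changes the processor
		MAC-MTD	Changes the MAC address
		IP-MTD	Changes the IP address
		Protocol-MTD	Changes the protocol
2	Network-layer MTD	Path MTD	Changes the path
		OS-MTD	Changes the operating system
		Finger-MTD	Changes the fingerprint
		Port-MTD	Changes the port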

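Method	Payoff quantification	Dynamics	Game type	Equilibrium solution	Optimal selection algorithm
Method of Ref. [22]	Historical data	Single-stage	Static game	Simple	Not given
Method of Ref. [23]	Historical data	Multi-stage	Dynamic game	Detailed	Given
Method of Ref. [8]	Historical data	Multi-stage	Markov matrix game	Detailed	Given
Proposed method	Historical data + time factor	Multi-stage	Markov time game	Detailed	Given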

Endpoint	Moving target attacker	Application server	LDAP server	FTP server	Linux database
Moving target attacker	local	IIS	—	—	—
Application server	—	local	all	all	Squid LICQ
LDAP server	—	—	local	all	Squid LICQ
FTP server	—	IIS	all	local	—
Linux database	—	—	all	all	local

Network stage state	MTA strategy	MTD strategy	MTA payoff	MTD payoff
S₁	[0.31, 0.29, 0.4]	[0.07, 0.48, 0.45]	187.9	-187.9
S₂	[0.32, 0.32, 0.36]	[0.17, 0.39, 0.44]	107.3	-107.3
S₃	[0.35, 0.38, 0.27]	[0.12, 0.36, 0.52]	473.5	-473.5
S₄	[0.12, 0.42, 0.46]	[0.33, 0.09, 0.58]	601.3	-601.3
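Two properties of the reported results can be checked mechanically: each mixed strategy should be a probability distribution, and the MTA and MTD payoffs should sum to zero in every state. The sketch below runs that sanity check on the values in the table. The names `RESULTS` and `check_row` are illustrative, and the check does not reproduce how the paper computes the payoffs.

```python
import math

# Values copied from the results table:
# state -> (MTA strategy, MTD strategy, MTA payoff, MTD payoff)
RESULTS = {
    "S1": ([0.31, 0.29, 0.40], [0.07, 0.48, 0.45], 187.9, -187.9),
    "S2": ([0.32, 0.32, 0.36], [0.17, 0.39, 0.44], 107.3, -107.3),
    "S3": ([0.35, 0.38, 0.27], [0.12, 0.36, 0.52], 473.5, -473.5),
    "S4": ([0.12, 0.42, 0.46], [0.33, 0.09, 0.58], 601.3, -601.3),
}


def check_row(mta, mtd, mta_payoff, mtd_payoff):
    """True if both strategies are distributions and the payoffs are zero-sum."""
    is_dist = all(
        min(p) >= 0 and math.isclose(sum(p), 1.0, abs_tol=1e-9)
        for p in (mta, mtd)
    )
    zero_sum = math.isclose(mta_payoff + mtd_payoff, 0.0, abs_tol=1e-9)
    return is_dist and zero_sum


if __name__ == "__main__":
    for state, row in RESULTS.items():
        print(state, check_row(*row))
```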


Related articles (7)


[1]	徐潇雨, 胡浩, 张红旗, 刘玉岭. 基于深度确定性策略梯度的随机路由防御方法[J]. 通信学报, 2021, 42(6): 41-51.
[2]	陈福才,何威振,程国振,霍树民,周大成. 基于DPDK的内网动态网关关键技术设计[J]. 通信学报, 2020, 41(6): 139-151.
[3]	蒋侣,张恒巍,王晋东. 基于信号博弈的移动目标防御最优策略选取方法[J]. 通信学报, 2019, 40(6): 128-137.
[4]	马多贺,李琼,林东岱. 基于POF的网络窃听攻击移动目标防御方法[J]. 通信学报, 2018, 39(2): 73-87.
[5]	雷程,马多贺,张红旗,韩琦,杨英杰. 基于最优路径跳变的网络移动目标防御技术[J]. 通信学报, 2017, 38(3): 133-143.
[6]	胡毅勋,郑康锋,杨义先,钮心忻. 基于OpenFlow的网络层移动目标防御方案[J]. 通信学报, 2017, 38(10): 102-112.
[7]	雷程,马多贺,张红旗,杨英杰,王淼. 基于变点检测的网络移动目标防御效能评估方法[J]. 通信学报, 2017, 38(1): 126-140.