Random routing defense method based on deep deterministic policy gradient

Journal on Communications, 2021, Vol. 42, Issue (6): 41-51. doi: 10.11959/j.issn.1000-436x.2021093

Xiaoyu XU¹,², Hao HU¹,², Hongqi ZHANG¹,², Yuling LIU³

¹ Cryptography Engineering Institute, Information Engineering University, Zhengzhou 450001, China
² Henan Key Laboratory of Information Security, Zhengzhou 450001, China
³ Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100190, China

Revised: 2021-03-31 Online: 2021-06-25 Published: 2021-06-01
About the authors: Xiaoyu XU (1992- ), male, born in Lianyungang, Jiangsu, is a Ph.D. candidate at Information Engineering University. His main research interests are active defense and intelligent decision-making.
Hao HU (1989- ), male, born in Chizhou, Anhui, Ph.D., is a lecturer at Information Engineering University. His main research interest is network security situational awareness.
Hongqi ZHANG (1962- ), male, born in Zunhua, Hebei, Ph.D., is a professor and doctoral supervisor at Information Engineering University. His main research interests include network security, risk assessment, classified protection, and information security management.
Yuling LIU (1982- ), male, born in Jiyang, Shandong, Ph.D., is an associate professor at the Institute of Information Engineering, Chinese Academy of Sciences. His main research interests are network security evaluation and classified protection.
Supported by:
The National Natural Science Foundation of China (61902427); The National Natural Science Foundation of China (61802404)

Abstract:

Existing random routing defenses split data flows at too coarse a granularity, protect the quality of service (QoS) of legitimate traffic poorly, and leave room for improvement in security against eavesdropping attacks. To address these problems, a random routing defense method based on deep deterministic policy gradient (DDPG) was proposed. In-band network telemetry (INT) was used to monitor and obtain the network state in real time. The DDPG method was used to generate random routing schemes that account for both security and QoS requirements. Programmable switches under the P4 framework executed the random routing schemes, achieving random routing defense at packet-level granularity. Experimental results show that, compared with other typical random routing methods, the proposed method improves both security against eavesdropping attacks and the protection of overall network QoS.

Key words: random routing, deep deterministic policy gradient, eavesdropping attack, moving target defense

CLC number: TP939

Xiaoyu XU, Hao HU, Hongqi ZHANG, Yuling LIU. Random routing defense method based on deep deterministic policy gradient[J]. Journal on Communications, 2021, 42(6): 41-51.

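Figures/Tables (15)

Fig. 1

Fig. 2

Table 1

System parameters and their meanings

Parameter	Meaning
F	A data flow in the network
RRS	Random routing scheme for data flow F
$Vol^{t}$	Volume of traffic transmitted in defense period t
$N_{dsp}^{t}$	Total number of delay-sensitive packets transmitted in defense period t
$Td_{dsp}^{t}$	Total delay of the packets transmitted in defense period t
$S_t$	State obtained in defense period t
$A_t$	Action executed in defense period t
$R_t$	Reward obtained in defense period t
Th	Individual threshold of the difference constraint
th	Overall threshold of the difference constraint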


Fig. 3

Fig. 4

Fig. 5

Table 2

Table 3

Table 4

Fig. 7

Table 5

Fig. 8

Fig. 9

Table 6

Table 7


Forwarding port	Probability
port₁	P₁
port₂	P₂
⋮	⋮
portₙ	Pₙ
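
The forwarding-port/probability table can be read as a per-packet sampling rule. The following is a minimal Python sketch of that idea, assuming each packet's egress port is drawn independently according to the listed probabilities. The function name `pick_port`, the port names, and the probability values are hypothetical; the paper executes its routing scheme on P4 programmable switches, which this sketch does not reproduce.

```python
import random
from collections import Counter


def pick_port(forwarding_table, rng):
    """Sample one egress port according to its forwarding probability."""
    ports = list(forwarding_table)
    weights = [forwarding_table[p] for p in ports]
    # Reject tables that are not a valid probability distribution.
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("probabilities must be non-negative and sum to 1")
    return rng.choices(ports, weights=weights, k=1)[0]


# Hypothetical table: port names and probabilities are illustrative only.
table = {"port_1": 0.5, "port_2": 0.3, "port_3": 0.2}

rng = random.Random(0)  # fixed seed so the run is repeatable
counts = Counter(pick_port(table, rng) for _ in range(10_000))

for port in table:
    print(port, counts[port] / 10_000)
```

Over many packets the empirical share of each port approaches its table probability, while the port taken by any single packet stays unpredictable to an observer on one link.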

Service scenario	Application-layer protocol
File transfer	FTP
Web browsing	HTTP
Video call	WebRTC
Live streaming	RTSP

Hyperparameter	Default value
T_slot	8 s
th	8
TH	10
μ	1
γ	2
R_BAD	-6
m	3
η	0.05
τ	0.01
f_c	10
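
The table lists τ = 0.01 without defining it. Assuming τ plays its usual role in DDPG, the soft-update coefficient that moves the target network toward the online network, the update can be sketched as below. This is an illustration under that assumption, not the paper's implementation: `soft_update` is a hypothetical helper and the parameters are plain lists of floats rather than network weights.

```python
def soft_update(target, online, tau=0.01):
    """Move each target parameter a fraction tau toward the online one.

    Implements target <- tau * online + (1 - tau) * target, element-wise.
    """
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


# Illustrative parameter vectors (assumed values, not from the paper).
online_params = [1.0, 0.0]
target_params = [0.0, 1.0]

# One update shifts the target by 1% of the gap to the online parameters.
one_step = soft_update(target_params, online_params)

# Repeated updates make the target track the online parameters slowly.
tracked = list(target_params)
for _ in range(1000):
    tracked = soft_update(tracked, online_params)
```

A small τ keeps the target changing slowly, which is the usual reason for using a separate target network in DDPG-style training.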

Method	FTP (TCP)	HTTP (TCP)	WebRTC (UDP)	RTSP (UDP)
RRM	3.190	1.797	3.711	3.854
AT-RRM	2.553	1.801	3.802	3.263
SSO-RM	3.217	1.911	3.786	3.299
Proposed method	3.454	2.009	5.183	5.499

Method	FTP	HTTP	WebRTC	RTSP
RRM	0.1263	0.1222	0.1413	0.1352
AT-RRM	0.0776	0.0759	0.0819	0.0801
SSO-RM	0.1154	0.1199	0.1138	0.1201
Proposed method	0.0039	0.0051	0.0047	0.0049


Method	Time efficiency/s
RRM	0.051499
AT-RRM	0.120358
SSO-RM	0.061295
Proposed method	0.032786
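
To put the time figures in relative terms, the short sketch below computes how much lower the last row of the table is than each of the other three rows. It assumes that a smaller time is better, which the table header (time efficiency in seconds) suggests but does not state; the dictionary keys and the helper `relative_reduction` are illustrative names.

```python
# Times in seconds, copied from the table above.
times = {
    "RRM": 0.051499,
    "AT-RRM": 0.120358,
    "SSO-RM": 0.061295,
    "proposed": 0.032786,
}


def relative_reduction(baseline, value):
    """Fraction by which `value` is lower than `baseline`."""
    return (baseline - value) / baseline


reductions = {
    name: relative_reduction(t, times["proposed"])
    for name, t in times.items()
    if name != "proposed"
}

for name, r in reductions.items():
    print(f"{name}: {r:.1%} lower")
```

Under that assumption, the proposed method's time is roughly 36% lower than RRM, 73% lower than AT-RRM, and 47% lower than SSO-RM.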
