物联网学报 (Chinese Journal on Internet of Things), 2023, Vol. 7, Issue (1): 73-82. doi: 10.11959/j.issn.2096-3750.2023.00316

• Theory and Technology •

Quality of service optimization algorithm based on deep reinforcement learning in software defined network

Cenhuishan LIAO¹, Junyan CHEN¹, Guanping LIANG², Xiaolan XIE¹, Xiaoye LU¹

  1 School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
  2 College of Computer, National University of Defense Technology, Changsha 410073, China
  • Revised: 2022-12-04  Published: 2023-03-30  Online: 2023-03-01
  • About the authors:
    Cenhuishan LIAO (廖岑卉珊, 1999- ), female, master's student at Guilin University of Electronic Technology. Her main research interests are software defined networks and deep reinforcement learning.
    Junyan CHEN (陈俊彦, 1985- ), male, Ph.D., senior experimentalist at Guilin University of Electronic Technology. His main research interests are reinforcement learning, graph neural networks, and software defined networks.
    Guanping LIANG (梁观平, 1998- ), male, Ph.D. student at National University of Defense Technology. His main research interests are software defined networks, traffic scheduling, and congestion control.
    Xiaolan XIE (谢小兰, 1999- ), female, master's student at Guilin University of Electronic Technology. Her main research interests are software defined networks, graph neural networks, and deep reinforcement learning.
    Xiaoye LU (卢小烨, 2000- ), male, student at Guilin University of Electronic Technology. His main research interests are software defined networks and deep reinforcement learning.
  • Supported by:
    The Guangxi Natural Science Foundation (2020GXNSFDA238001); The Guangxi Project to Improve the Basic Scientific Research Ability of Young and Middle-aged University Teachers (2020KY05033)

Abstract:

Deep reinforcement learning has strong decision-making and generalization abilities and is often applied to quality of service (QoS) optimization in software defined networks (SDN). However, traditional deep reinforcement learning algorithms suffer from slow convergence and instability. An algorithm of quality of service optimization based on deep reinforcement learning (AQSDRL) is proposed to solve the QoS problem of SDN in data center network (DCN) applications. AQSDRL trains its model with the softmax deep double deterministic policy gradient (SD3) algorithm and optimizes SD3 with a SumTree-based prioritized experience replay mechanism, which draws samples with a larger temporal-difference error (TD-error) with higher probability to train the neural network, effectively improving the convergence speed and stability of the algorithm. Experimental results show that, compared with existing deep reinforcement learning algorithms, the proposed AQSDRL effectively reduces network transmission delay and improves the load balancing performance of the network.

Key words: deep reinforcement learning, software defined network (SDN), quality of service (QoS), data center network (DCN), SumTree
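
The SumTree-based prioritized replay named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class and function names, the priority formula (|TD-error| + eps)^alpha, and the values of alpha and eps are assumptions borrowed from the commonly used form of prioritized experience replay. The sketch only shows how a sum tree lets transitions be drawn with probability proportional to their priority in O(log n) time.

```python
import random


class SumTree:
    """Array-backed binary tree: leaves hold priorities, parents hold sums."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Indices 0..capacity-2 are internal nodes, the rest are leaves.
        self.tree = [0.0] * (2 * capacity - 1)
        self.data = [None] * capacity
        self.write = 0  # next leaf slot to overwrite (ring buffer)

    def total(self):
        """Sum of all stored priorities (value of the root)."""
        return self.tree[0]

    def update(self, leaf, priority):
        """Set the priority of one leaf and propagate the change to the root."""
        idx = leaf + self.capacity - 1
        change = priority - self.tree[idx]
        self.tree[idx] = priority
        while idx > 0:
            idx = (idx - 1) // 2
            self.tree[idx] += change

    def add(self, priority, sample):
        """Store a transition, overwriting the oldest one when full."""
        self.data[self.write] = sample
        self.update(self.write, priority)
        self.write = (self.write + 1) % self.capacity

    def get(self, value):
        """Find the leaf whose cumulative-priority interval contains value."""
        idx = 0
        while 2 * idx + 1 < len(self.tree):  # stop once idx is a leaf
            left = 2 * idx + 1
            if value <= self.tree[left]:
                idx = left
            else:
                value -= self.tree[left]
                idx = left + 1
        leaf = idx - (self.capacity - 1)
        return leaf, self.tree[idx], self.data[leaf]


def priority_from_td_error(td_error, alpha=0.6, eps=1e-2):
    """Assumed priority rule: larger |TD-error| gives larger priority."""
    return (abs(td_error) + eps) ** alpha


def sample_batch(tree, batch_size, rng=random):
    """Stratified sampling: one uniform draw from each equal slice of the total."""
    segment = tree.total() / batch_size
    return [tree.get(rng.uniform(segment * i, segment * (i + 1)))
            for i in range(batch_size)]
```

In a training loop under these assumptions, each new transition would be stored with tree.add(priority_from_td_error(delta), transition), a batch would be drawn with sample_batch(tree, batch_size), and tree.update(leaf, new_priority) would be called for each sampled leaf once its TD-error has been recomputed.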
