Journal on Communications ›› 2019, Vol. 40 ›› Issue (12): 60-67. doi: 10.11959/j.issn.1000-436x.2019227

• Papers

Software-defined networking QoS optimization based on deep reinforcement learning

Julong LAN, Xueshuai ZHANG, Yuxiang HU, Penghao SUN

  1. National Digital Switching System Engineering & Research Center, Zhengzhou 450001, China
  • Revised: 2019-10-28 Online: 2019-12-25 Published: 2020-01-16
  • About the authors:
    Julong LAN (1962- ), male, born in Zhangjiakou, Hebei; Ph.D., professor and doctoral supervisor at the National Digital Switching System Engineering & Research Center. His main research interest is the key theories and technologies of future information and communication networks.
    Xueshuai ZHANG (1994- ), male, born in Heze, Shandong; master's student at the National Digital Switching System Engineering & Research Center. His main research interest is software-defined networking.
    Yuxiang HU (1982- ), male, born in Zhoukou, Henan; Ph.D., associate professor and doctoral supervisor at the National Digital Switching System Engineering & Research Center. His main research interests include key technologies for future networks and network intelligence.
    Penghao SUN (1992- ), male, born in Qingdao, Shandong; doctoral student at the National Digital Switching System Engineering & Research Center. His main research interests include software-defined networking and traffic engineering.
  • Supported by:
    The National Key Research and Development Program of China (2017YFB0803204); The National Natural Science Foundation of China (61521003); The National Natural Science Foundation of China (61702547); The National Natural Science Foundation of China (61872382); The Research and Development Program in Key Areas of Guangdong Province (2018B010113001)

Abstract:

In software-defined networking (SDN) scenarios, the mainstream QoS optimization schemes based on heuristic algorithms often suffer performance degradation because their parameters do not match the network scenario. To address this problem, an SDN QoS optimization algorithm based on deep reinforcement learning is proposed. First, network resources and state information are unified into a network model. Next, a long short-term memory (LSTM) network is used to improve the algorithm's traffic-awareness capability. Finally, a dynamic traffic scheduling policy that satisfies the QoS objectives is generated based on deep reinforcement learning. Experimental results show that, compared with existing algorithms, the proposed algorithm not only guarantees the end-to-end transmission delay and packet loss rate, but also improves network load balancing by 22.7% and increases network throughput by 8.2%.

Key words: software-defined networking, deep reinforcement learning, long short-term memory, quality of service

