Journal on Communications ›› 2022, Vol. 43 ›› Issue (8): 30-40. doi: 10.11959/j.issn.1000-436x.2022148

• Academic Paper

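Forwarding efficiency aware traffic scheduling algorithm based on deep reinforcement learning

Zongxuan SHA1, Ru HUO1,2, Chuang SUN3, Shuo WANG2,4, Tao HUANG2,4

    1 Information Department, Beijing University of Technology, Beijing 100124, China
    2 Purple Mountain Laboratories, Nanjing 211111, China
    3 Department of Automation, Tsinghua University, Beijing 100084, China
    4 State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Revised: 2022-07-19 Online: 2022-08-25 Published: 2022-08-01
  • About the authors: Zongxuan SHA (1990- ), male, Hui ethnicity, born in Bengbu, Anhui Province, is a Ph.D. candidate at Beijing University of Technology. His main research interests include future networks, network artificial intelligence, deep learning, and reinforcement learning.
    Ru HUO (1988- ), female, born in Harbin, Heilongjiang Province, Ph.D., is a lecturer at Beijing University of Technology. Her main research interests include future networks, the industrial Internet, edge computing, network resource management, and blockchain.
    Chuang SUN (1989- ), male, born in Harbin, Heilongjiang Province, Ph.D., is a postdoctoral researcher at Tsinghua University. His main research interests include sensor measurement and testing technology, precise calibration methods for initial instruments, and inertial navigation technology.
    Shuo WANG (1991- ), male, born in Lingbao, Henan Province, Ph.D., is a lecturer at Beijing University of Posts and Telecommunications. His main research interests include data center networks, software defined networks, and network traffic scheduling.
    Tao HUANG (1980- ), male, born in Chongqing, Ph.D., is a professor at Beijing University of Posts and Telecommunications. His main research interests include future network architecture, software defined networks, and network virtualization.
  • Supported by:
    The MIIT of China 2020 (Identification Resources Search System for Industrial Internet of Things); Young Elite Scientist Sponsorship Program by CAST and China-CIC (YESS20200287)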

Abstract:

Software defined networking (SDN) separates the control plane from the data plane, which enables flexible traffic scheduling and more efficient use of network resources. However, the combined effect of factors such as growth in the number of flow entries, the device load rate, and the number of connected hosts reduces the forwarding efficiency of SDN switches, which in turn affects the end-to-end transmission delay. To solve this problem, a forwarding efficiency aware traffic scheduling algorithm based on deep reinforcement learning was proposed. First, the switch state information was unified into a perception model, and a neural network was used to establish the mapping between switch state information and forwarding efficiency. Then, the traffic scheduling policy was generated through deep reinforcement learning by combining the network state with traffic information. Finally, expert samples generated by the shortest path and load balancing algorithms were used to guide model training, which allowed the model to learn from the expert samples to improve performance and also improved training efficiency. The experimental results show that, compared with other algorithms, the proposed algorithm not only reduces the average end-to-end transmission delay by 15.31% but also ensures the overall load balance of the network.

Key words: software defined network, deep reinforcement learning, traffic scheduling, forwarding efficiency aware, load balance
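
To make the idea of forwarding efficiency awareness concrete, the following minimal sketch compares a hop-based shortest path with a path that also accounts for a per-switch forwarding delay. It is an illustration under stated assumptions, not the algorithm from the paper: the function name `cheapest_path`, the topology, and all delay values (in microseconds) are hypothetical, the per-switch delays merely stand in for what a perception model might estimate, and plain Dijkstra is used in place of the deep reinforcement learning policy.

```python
import heapq


def cheapest_path(links, switch_delay, src, dst):
    """Dijkstra over link delays plus a per-switch forwarding delay.

    links: {node: {neighbor: link_delay}}, hypothetical topology.
    switch_delay: {node: forwarding_delay}, assumed to come from some
    estimator of switch forwarding efficiency (not specified here).
    """
    # Traversing a switch costs its estimated forwarding delay.
    best = {src: switch_delay[src]}
    prev = {}
    heap = [(best[src], src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, link_delay in links.get(node, {}).items():
            new_cost = cost + link_delay + switch_delay[nxt]
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if dst not in best:
        return None, float("inf")
    # Walk predecessors back from dst to rebuild the path.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], best[dst]


# Hypothetical topology: s1-s2-s4 (2 hops) and s1-s3-s5-s4 (3 hops).
links = {
    "s1": {"s2": 100, "s3": 100},
    "s2": {"s1": 100, "s4": 100},
    "s3": {"s1": 100, "s5": 100},
    "s5": {"s3": 100, "s4": 100},
    "s4": {"s2": 100, "s5": 100},
}
# s2 is assumed to be heavily loaded, so it forwards slowly.
aware = {"s1": 10, "s2": 200, "s3": 10, "s5": 10, "s4": 10}
# Ignoring switch state reduces the search to a plain shortest path.
unaware = {name: 0 for name in aware}

aware_path, aware_cost = cheapest_path(links, aware, "s1", "s4")
plain_path, plain_cost = cheapest_path(links, unaware, "s1", "s4")
```

In this toy setting the plain shortest path goes through the loaded switch `s2`, while the delay-aware search takes the longer route through `s3` and `s5`. This is the kind of trade-off that a scheduler informed by switch forwarding efficiency is meant to capture.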

CLC number:
