Chinese Journal on Internet of Things ›› 2022, Vol. 6 ›› Issue (4): 1-13. doi: 10.11959/j.issn.2096-3750.2022.00306

• Theory and Technology •

Reinforcement learning-based real-time video streaming control and on-device training research

Huanhuan ZHANG, Anfu ZHOU, Huadong MA

  1. Beijing Key Lab of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Revised: 2022-10-17 Online: 2022-12-30 Published: 2022-12-01
  • About the authors: Huanhuan ZHANG (1994- ), female, Ph.D., postdoctoral researcher at Beijing University of Posts and Telecommunications. Her research interests include the Internet of Things, mobile computing, and video transmission.
    Anfu ZHOU (1981- ), male, Ph.D., professor at Beijing University of Posts and Telecommunications. His research interests include Internet of Things sensing, millimeter wave, and real-time video transmission.
    Huadong MA (1964- ), male, Ph.D., professor at Beijing University of Posts and Telecommunications, IEEE Fellow. His research interests include the Internet of Things and sensor networks, and multimedia systems and networks.
  • Supported by:
    The National Natural Science Foundation of China (61921003); The China National Postdoctoral Program for Innovative Talents (BX20220046)

Abstract:

Service platforms built on the Internet of Things and the mobile Internet are developing rapidly. Hundreds of millions of end users communicate through real-time video, which has become an irreplaceable core tool in digital life. However, the Internet is highly dynamic and strongly heterogeneous, which imposes stringent requirements on real-time video streaming control, and the quality of experience (QoE) of real-time video remains unsatisfactory. A reinforcement learning-driven adaptive streaming control algorithm suited to heterogeneous network environments was designed, an end-to-end on-device training framework for mobile terminals was developed to reduce server overhead, and the design and structure of the algorithm's neural network were evaluated in detail. Experimental results show that the proposed algorithm predicts heterogeneous network bandwidth effectively, reducing the bandwidth prediction error by 48.48% compared with a representative streaming control algorithm. More accurate bandwidth prediction in turn improves user QoE: video fluency improves by 60.65% and video clarity by 16.52%. In addition, the evaluation provides empirical guidance for optimizing real-time video streaming and has the potential to advance intelligent video applications.

Key words: real-time video, adaptive streaming control, quality of experience, reinforcement learning, on-device training
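
The abstract reports a relative reduction in bandwidth prediction error but does not define the error metric. The sketch below only illustrates how such a percentage could be computed, assuming the error is a mean absolute error between predicted and measured bandwidth. The function names and all sample values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def mean_abs_error(predicted_kbps, actual_kbps):
    """Mean absolute error between predicted and measured bandwidth (assumed metric)."""
    if not actual_kbps or len(predicted_kbps) != len(actual_kbps):
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(p - a) for p, a in zip(predicted_kbps, actual_kbps))


def relative_reduction_percent(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_error - new_error) / baseline_error * 100.0


# Hypothetical bandwidth traces in kbps, for illustration only.
actual = [1000, 1200, 800, 1500]
baseline_pred = [1400, 900, 1100, 1100]   # stand-in for a baseline algorithm
learned_pred = [1100, 1100, 900, 1300]    # stand-in for a learned predictor

baseline_err = mean_abs_error(baseline_pred, actual)
learned_err = mean_abs_error(learned_pred, actual)
reduction = relative_reduction_percent(baseline_err, learned_err)
print(f"baseline MAE = {baseline_err} kbps, learned MAE = {learned_err} kbps")
print(f"relative error reduction = {reduction:.2f}%")
```

Under this reading, a reported figure such as 48.48% would mean the learned predictor's average error is roughly half that of the baseline on the same traces.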
