Journal on Communications (通信学报), 2021, Vol. 42, Issue (9): 205-217. doi: 10.11959/j.issn.1000-436x.2021178

• Survey •

Survey on reinforcement learning based adaptive bit rate algorithm for mobile video streaming services

Li’na DU (1, 2), Li ZHUO (1, 2), Shuo YANG (1, 2), Jiafeng LI (1, 2), Jing ZHANG (1, 2)

  1. Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China
  2. Information Department, Beijing University of Technology, Beijing 100124, China
  • Revised: 2021-06-10 Online: 2021-09-25 Published: 2021-09-01
  • About the authors:
    Li’na DU (born 1995), female, from Xinzhou, Shanxi; Ph.D. candidate at Beijing University of Technology. Main research interests: video quality assessment and adaptive bit rate algorithms.
    Li ZHUO (born 1971), female, from Xuzhou, Jiangsu; Ph.D., professor and doctoral supervisor at Beijing University of Technology. Main research interests: image/video coding and transmission, multimedia big data processing, etc.
    Shuo YANG (born 1993), male, from Shangqiu, Henan; master's student at Beijing University of Technology. Main research interest: video quality assessment.
    Jiafeng LI (born 1986), male, from Tianjin; Ph.D., lecturer and master's supervisor at Beijing University of Technology. Main research interests: computer vision and image enhancement.
    Jing ZHANG (born 1975), female, from Meixian, Guangdong; Ph.D., professor and doctoral supervisor at Beijing University of Technology. Main research interests: image/video processing, image recognition, and image retrieval.
  • Supported by:
    The National Natural Science Foundation of China (61531006); Joint Funding Project of the Beijing Municipal Education Commission and the Beijing Natural Science Foundation (KZ201910005007)

Abstract:

In recent years, with the continuous release of HTTP adaptive streaming (HAS) video datasets and network trace datasets, machine learning methods such as reinforcement learning and deep learning have been increasingly applied to adaptive bit rate (ABR) algorithms. These methods learn the optimal rate control strategy through interaction and achieve performance far exceeding that of traditional heuristic methods. Based on an analysis of the research difficulties of ABR algorithms, the research advances in ABR algorithms based on reinforcement learning (including deep reinforcement learning) were investigated. Furthermore, several representative HAS video datasets and network trace datasets were summarized, and the evaluation metrics for algorithm performance were described. Finally, the existing problems and future directions of ABR research were discussed.

Key words: reinforcement learning, ABR algorithm, QoE, deep learning, deep reinforcement learning
