Journal on Communications, 2021, Vol. 42, Issue (9): 205-217. doi: 10.11959/j.issn.1000-436x.2021178
• Comprehensive Reviews •
Li’na DU1,2, Li ZHUO1,2, Shuo YANG1,2, Jiafeng LI1,2, Jing ZHANG1,2
Revised: 2021-06-10
Online: 2021-09-25
Published: 2021-09-01
Li’na DU, Li ZHUO, Shuo YANG, Jiafeng LI, Jing ZHANG. Survey on reinforcement learning based adaptive bit rate algorithm for mobile video streaming services[J]. Journal on Communications, 2021, 42(9): 205-217.

ABR algorithm | Reinforcement learning method | Reward function |
Ref. [ | Q-learning | |
Ref. [ | Q-learning | |
Ref. [ | HMM | |
Ref. [ | MDP | |
Ref. [ | DP | |
Ref. [ | DP | |
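
Two of the entries listed above rely on Q-learning. The following is a minimal sketch of the standard tabular Q-learning update as a streaming client might apply it when picking a bitrate level. The state encoding, the learning rate `alpha`, the discount factor `gamma`, and all function names are illustrative assumptions; they are not taken from the surveyed papers.

```python
import random
from collections import defaultdict


def make_q_table(num_actions):
    """Q-table mapping a hashable state to a list of action values (all start at 0)."""
    return defaultdict(lambda: [0.0] * num_actions)


def choose_action(q_table, state, epsilon, rng=random):
    """Epsilon-greedy choice of a bitrate level index (hypothetical helper)."""
    values = q_table[state]
    if rng.random() < epsilon:
        return rng.randrange(len(values))  # explore
    return max(range(len(values)), key=values.__getitem__)  # exploit


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q_table[next_state])
    td_error = reward + gamma * best_next - q_table[state][action]
    q_table[state][action] += alpha * td_error
    return q_table[state][action]
```

In this sketch a state could be any hashable summary of the client's situation, for example a tuple of discretised buffer level and measured throughput; the reward would come from whichever QoE-based reward function the algorithm defines.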

Algorithm | Service type | Reward function | Reinforcement learning algorithm | Network trace datasets | Average QoE improvement | Bandwidth saving |
Ref. [ | On-demand | | Actor-Critic | HSDPA, FCC | Baseline | Baseline |
Ref. [ | On-demand | | Actor-Critic | HSDPA, FCC | 30% | — |
Ref. [ | On-demand | $Q_t = P_t + S_t + A_t$, where $P_t$, $S_t$, and $A_t$ denote video quality, stall duration, and smoothness, respectively | — | HSDPA, FCC | 5.7% | 27.9% |
Ref. [ | On-demand | | DP | FCC | — | — |
Ref. [ | On-demand | | Double Q-learning and dueling network | 4G LTE | — | — |
Ref. [ | On-demand | | Imitation learning | HSDPA, FCC, Oboe | 7.37% | — |
Ref. [ | On-demand | Same as the Pensieve algorithm | Actor-Critic | HSDPA | 8.84% | — |
Ref. [ | On-demand | | Actor-Critic | HSDPA, FCC | 43.08% | 17.13% |
Ref. [ | Live | | Double-DQN | — | 15% | — |
Ref. [ | On-demand | | Actor-Critic | HSDPA, FCC, Belgium | 17% | — |
Ref. [ | On-demand | | Actor-Critic | Real-world environment | — | — |
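
One row of the table above gives a reward of the form Q_t = P_t + S_t + A_t, combining video quality, stall duration, and smoothness. A minimal sketch of such a per-chunk reward follows. It assumes that the stall and smoothness terms enter as non-positive penalties and that quality is measured by bitrate; the weights and the function name are hypothetical and are not values reported in the source.

```python
def chunk_reward(bitrate_mbps, prev_bitrate_mbps, stall_s,
                 stall_weight=4.0, smooth_weight=1.0):
    """Per-chunk reward Q_t = P_t + S_t + A_t (illustrative weights only).

    P_t: quality term, here simply the chunk bitrate in Mbit/s (assumption).
    S_t: stall term, a penalty proportional to the stall duration in seconds.
    A_t: smoothness term, a penalty on the bitrate change between chunks.
    """
    p_t = bitrate_mbps
    s_t = -stall_weight * stall_s
    a_t = -smooth_weight * abs(bitrate_mbps - prev_bitrate_mbps)
    return p_t + s_t + a_t
```

With these assumed weights, a chunk played at a steady bitrate without stalling is rewarded by its bitrate alone, while a half-second stall or a bitrate jump lowers the reward.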

Database | Release year | QoE impairment factors | Source videos | Network trace types | Distorted videos | Viewing device | Perceptual video quality metrics |
LIVAMVQA[ | 2012 | Stalling | 10 | — | 200 | Phone, Tablet | MOS, MS-SSIM, SSIM, PSNR |
LIVE QHVS[ | 2014 | Quality switching | 3 | — | 15 | HDTV | MOS, MS-SSIM, SSIM, PSNR |
LIVE Mobile Stall Video Database-II[ | 2014 | Stalling | 24 | — | 176 | Apple iPhone 5 | MOS |
LIVE Stall Study[ | 2017 | Quality switching, stalling | 26 | — | 174 | PC | MOS |
LIVE-NFLX-II[ | 2018 | Quality switching, stalling | 15 | 7 | 420 | Computer monitor | MOS, VMAF, SSIM, PSNR |
Waterloo SQoE-III[ | 2018 | Quality switching, stalling, initial buffering time | 20 | 13 | 450 | HDTV | MOS, VMAF, SSIM, PSNR |
Waterloo SQoE-IV[ | 2020 | Quality switching, stalling | 5 | 9 | 1350 | Phone, HDTV, UHDTV | MOS, VMAF, SSIM |

Database | Release year | Mobility pattern | Sampling interval / duration | Number of traces | Bandwidth range |
HSDPA[ | 2013 | Metro, tram, train, bus, ferry, and car | 1 s / 30 min | 86 | 0~3 Mbit/s |
FCC[ | 2016 | — | 1 s / approx. 3.7 days | 1 000 | — |
Belgium[ | 2016 | Walking, bicycle, bus, tram, train, and car | 1 s / 5 h | 40 | 0~111 Mbit/s |
3G&4G[ | 2016 | Vehicular driving environment | 4G: 10 s, 3G: 15 s / approx. 15 h, approx. 38 h | 56 754 | 0~3 Mbit/s |
Oboe[ | 2018 | — | — | 571 | 0~6 Mbit/s |
4G LTE[ | 2018 | Static, pedestrian, car, bus, and train | 1 s / 15 min | 135 | 0~173 Mbit/s |
5G[ | 2020 | Applications (file download, etc.) and mobility patterns (static, driving) | 1 s / approx. 3 142 min | 83 | 0~1 Gbit/s |
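
Most of the traces listed above record throughput at a fixed sampling interval, typically 1 s. A minimal sketch of how such a trace might drive a simulated chunk download is given below. It assumes the trace is a plain list of throughput samples in Mbit/s, one per interval, and that the trace wraps around when exhausted; the function name and parameters are hypothetical and do not describe any particular simulator.

```python
def download_time(chunk_bits, trace_mbps, start_s=0.0, interval_s=1.0):
    """Seconds needed to download chunk_bits, starting at time start_s.

    trace_mbps[i] is the throughput during [i * interval_s, (i + 1) * interval_s).
    The trace is replayed from the beginning once its end is reached (assumption).
    """
    if chunk_bits <= 0:
        return 0.0
    if not trace_mbps or max(trace_mbps) <= 0:
        raise ValueError("trace must contain at least one positive throughput sample")
    remaining = float(chunk_bits)
    t = start_s
    while True:
        slot = int(t // interval_s)
        slot_end = (slot + 1) * interval_s
        rate_bps = trace_mbps[slot % len(trace_mbps)] * 1e6
        capacity = rate_bps * (slot_end - t)  # bits deliverable before the slot ends
        if rate_bps > 0 and capacity >= remaining:
            return t + remaining / rate_bps - start_s
        remaining -= capacity
        t = slot_end
```

For example, under these assumptions a 2 Mbit chunk requested at the start of a trace that delivers 1 Mbit/s in the first second and 2 Mbit/s in the second takes 1.5 s to download.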
[1] | Cisco. Cisco visual networking index:Forecast and methodology[R]. 2019. |
[2] | XIAO A L , LIU J , LI Y Z ,et al. Two-phase rate adaptation strategy for improving real-time video QoE in mobile networks[J]. China Communications, 2018,15(10): 12-24. |
[3] | ITU-T. Definition of quality of experience,international telecommunication union,liaison statement,Ref:TD109rev2 (PLEN/12)[S]. 2007. |
[4] | STOCKHAMMER T . Dynamic adaptive streaming over HTTP:standards and design principles[C]// Proceedings of the Second Annual ACM Conference on Multimedia Systems. New York:ACM Press, 2011: 133-144. |
[5] | LEVKOV M . Video encoding and transcoding recommendations for HTTP dynamic streaming on the Adobe® Flash® Platform[R]. 2010. |
[6] | FECHEYR-LIPPENS A . A review of HTTP live streaming[R]. 2010. |
[7] | ZAMBELLI A . IIS smooth streaming technical overview[R]. 2009. |
[8] | KUA J , ARMITAGE G , BRANCH P . A survey of rate adaptation techniques for dynamic adaptive streaming over HTTP[J]. IEEE Communications Surveys & Tutorials, 2017,19(3): 1842-1866. |
[9] | AYAD I , IM Y , KELLER E ,et al. A practical evaluation of rate adaptation algorithms in HTTP-based adaptive streaming[J]. Computer Networks, 2018,133: 90-103. |
[10] | SUTTON R S , BARTO A G . Reinforcement learning:an introduction[M]. Cambridge: MIT Press, 1998. |
[11] | FRANÇOIS-LAVET V , HENDERSON P , ISLAM R ,et al. An introduction to deep reinforcement learning[J]. Now Publishers, 2018,11(3-4): 219-354. |
[12] | CLAEYS M , LATRÉ S , FAMAEY J ,et al. Design of a Q-learning based client quality selection algorithm for HTTP adaptive video streaming[C]// Adaptive and Learning Agents Workshop (ALA).[S.n.:s.l.], 2013: 30-37. |
[13] | CLAEYS M , LATRÉ S , FAMAEY J ,et al. Design and evaluation of a self-learning HTTP adaptive video streaming client[J]. IEEE Communications Letters, 2014,18(4): 716-719. |
[14] | SUN Y , YIN X Q , JIANG J C ,et al. CS2P:improving video bitrate selection and adaptation with data-driven throughput prediction[C]// Proceedings of the 2016 ACM SIGCOMM Conference.Florianopolis Brazil. New York:ACM Press, 2016: 272-285. |
[15] | CHIARIOTTI F , D’ARONCO S , TONI L ,et al. Online learning adaptation strategy for DASH clients[C]// Proceedings of the 7th International Conference on Multimedia Systems. New York:ACM Press, 2016: 1-12. |
[16] | ANDELIN T , CHETTY V , HARBAUGH D ,et al. Quality selection for dynamic adaptive streaming over HTTP with scalable video coding[C]// Proceedings of the 3rd Multimedia Systems Conference - MMSys '12. New York:ACM Press, 2012: 149-154. |
[17] | LI Z , BEGEN A C , GAHM J ,et al. Streaming video over HTTP with consistent quality[C]// Proceedings of the 5th ACM Multimedia Systems Conference - MMSys '14. New York:ACM Press, 2014: 248-258. |
[18] | KELLY F P , MAULLOO A K , TAN D K H . Rate control for communication networks:shadow prices,proportional fairness and stability[J]. Journal of the Operational Research Society, 1998,49(3): 237-252. |
[19] | YIN X Q , JINDAL A , SEKAR V ,et al. A control-theoretic approach for dynamic adaptive video streaming over HTTP[J]. ACM SIGCOMM Computer Communication Review, 2015,45(4): 325-338. |
[20] | GARCÍA S , CABRERA J , GARCÍA N . Quality-optimization algorithm based on stochastic dynamic programming for MPEG DASH video streaming[C]// IEEE International Conference on Consumer Electronics (ICCE). Piscataway:IEEE Press, 2014: 574-575. |
[21] | MNIH V , KAVUKCUOGLU K , SILVER D ,et al. Playing atari with deep reinforcement learning[J]. arXiv Preprint,arXiv:1312.5602, 2013. |
[22] | MAO H Z , NETRAVALI R , ALIZADEH M . Neural adaptive video streaming with Pensieve[C]// Proceedings of the Conference of the ACM Special Interest Group on Data Communication. New York:ACM Press, 2017: 197-210. |
[23] | MNIH V , BADIA A P , MIRZA M ,et al. Asynchronous methods for deep reinforcement learning[C]// International Conference on Machine Learning (ICML). New York:ACM Press, 2016: 1928-1937. |
[24] | PAUL C , HUDSON A . CS 244’18:recreating and extending pensieve[R]. 2018. |
[25] | SENGUPTA S , GANGULY N , CHAKRABORTY S ,et al. HotDASH:hotspot aware adaptive video streaming using deep reinforcement learning[C]// 2018 IEEE 26th International Conference on Network Protocols (ICNP). Piscataway:IEEE Press, 2018: 165-175. |
[26] | GAO G Y , DONG L S , ZHANG H Z ,et al. Content-aware personalised rate adaptation for adaptive streaming via deep video analysis[C]// 2019 IEEE International Conference on Communications (ICC). Piscataway:IEEE Press, 2019: 1-8. |
[27] | HU S H , XU M , ZHANG H M ,et al. Affective content-aware adaptation scheme on QoE optimization of adaptive streaming over HTTP[J]. ACM Transactions on Multimedia Computing,Communications,and Applications, 2020,15(3s): 100. |
[28] | ZHANG X , OU Y Y , SEN S ,et al. SENSEI:aligning video streaming quality with dynamic user sensitivity[C]// USENIX Symposium on Networked Systems Design and Implementation (NSDI). Berkeley:USENIX Association, 2021: 303-320. |
[29] | DUANMU Z F , LIU W T , CHEN D Q ,et al. A knowledge-driven quality-of-experience model for adaptive streaming videos[J]. arXiv Preprint,arXiv:1911.07944, 2019. |
[30] | XIAO A L , HUANG X F , WU S ,et al. Traffic-aware rate adaptation for improving time-varying QoE factors in mobile video streaming[J]. IEEE Transactions on Network Science and Engineering, 2020,7(4): 2392-2405. |
[31] | HUANG T C , YAO X , WU C L ,et al. Tiyuntsong:a self-play reinforcement learning approach for ABR video streaming[C]// IEEE International Conference on Multimedia and Expo (ICME). Piscataway:IEEE Press, 2019: 1678-1683. |
[32] | HUANG T C , ZHANG R X , SUN L F . Self-play reinforcement learning for video transmission[C]// Proceedings of the 30th ACM Workshop on Network and Operating Systems Support for Digital Audio and Video. New York:ACM Press, 2020: 1-10. |
[33] | SCHULMAN J , WOLSKI F , DHARIWAL P ,et al. Proximal policy optimization algorithms[J]. arXiv Preprint,arXiv:1707.06347, 2017. |
[34] | HUO L Y , WANG Z L , XU M ,et al. A meta-learning framework for learning multi-user preferences in QoE optimization of DASH[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020,30(9): 3210-3225. |
[35] | FRANS K , HO J , CHEN X ,et al. Meta learning shared hierarchies[C]// International Conference on Learning Representations (ICLR).[S.n.:s.l.], 2018: 1-11. |
[36] | LIU J , TAO X M , LU J H . QoE-oriented rate adaptation for DASH with enhanced deep Q-learning[J]. IEEE Access, 2019,7: 8454-8469. |
[37] | SALEEM M , SALEEM Y , ASIF H M S ,et al. Quality enhanced multimedia content delivery for mobile cloud with deep reinforcement learning[J]. Wireless Communications and Mobile Computing, 2019,2019: 1-15. |
[38] | HUANG T C , ZHOU C , ZHANG R X ,et al. Comyco:quality-aware adaptive video streaming via imitation learning[C]// Proceedings of the 27th ACM International Conference on Multimedia. New York:ACM Press, 2019: 429-437. |
[39] | LI W H , HUANG J W , WANG S Q ,et al. DAVS:dynamic-chunk quality aware adaptive video streaming using apprenticeship learning[C]// GLOBECOM 2020 - 2020 IEEE Global Communications Conference. Piscataway:IEEE Press, 2020: 1-6. |
[40] | GADALETA M , CHIARIOTTI F , ROSSI M ,et al. D-DASH:a deep Q-learning framework for DASH video streaming[J]. IEEE Transactions on Cognitive Communications and Networking, 2017,3(4): 703-718. |
[41] | LEKHARU A , MOULII K Y , SUR A ,et al. Deep learning based prediction model for adaptive video streaming[C]// 2020 International Conference on Communication Systems & Networks (COMSNETS). Piscataway:IEEE Press, 2020: 152-159. |
[42] | ZHANG G H , LEE J Y B . Ensemble adaptive streaming – A new paradigm to generate streaming algorithms via specializations[J]. IEEE Transactions on Mobile Computing, 2020,19(6): 1346-1358. |
[43] | ZHAO Y , SHEN Q W , LI W ,et al. Latency aware adaptive video streaming using ensemble deep reinforcement learning[C]// Proceedings of the 27th ACM International Conference on Multimedia. New York:ACM Press, 2019: 2647-2651. |
[44] | AKHTAR Z , NAM Y S , GOVINDAN R ,et al. Oboe:auto-tuning video ABR algorithms to network conditions[C]// Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication. New York:ACM Press, 2018: 44-58. |
[45] | YEO H , JUNG Y , KIM J ,et al. Neural adaptive content-aware internet video delivery[C]// Symposium on Operating Systems Design and Implementation (OSDI).[S.n.:s.l.], 2018: 645-661. |
[46] | HUANG T C , ZHANG R X , ZHOU C ,et al. QARC:video quality aware rate control for real-time video streaming based on deep reinforcement learning[C]// Proceedings of the 26th ACM international conference on Multimedia. New York:ACM Press, 2018: 1208-1216. |
[47] | HUANG T C , ZHANG R X , WU C L ,et al. Generalizing rate control strategies for realtime video streaming via learning from deep learning[C]// MMAsia '19:Proceedings of the ACM Multimedia Asia. New York:ACM Press, 2019: 1-6. |
[48] | TIAN Z , ZHAO L P , NIE L H ,et al. Deeplive:QoE optimization for live video streaming through deep reinforcement learning[C]// 2019 IEEE 25th International Conference on Parallel and Distributed Systems (ICPADS). Piscataway:IEEE Press, 2019: 827-831. |
[49] | NASRABADI A T , PRAKASH R . Layer-assisted adaptive video streaming[C]// Proceedings of the 28th ACM SIGMM Workshop on Network and Operating Systems Support for Digital Audio and Video.Amsterdam Netherlands. New York:ACM Press, 2018: 31-36. |
[50] | LIU Y Z , JIANG B , GUO T ,et al. Grad:learning for overhead-aware adaptive video streaming with scalable video coding[C]// Proceedings of the 28th ACM International Conference on Multimedia. New York:ACM Press, 2020: 1-9. |
[51] | WANG Y N , WANG H L , SHANG J Y ,et al. RESA:a real-time evaluation system for ABR[C]// 2019 IEEE International Conference on Multimedia and Expo (ICME). Piscataway:IEEE Press, 2019: 1846-1851. |
[52] | HUANG T C , ZHOU C , YAO X ,et al. Quality-aware neural adaptive video streaming with lifelong imitation learning[J]. IEEE Journal on Selected Areas in Communications, 2020,38(10): 2324-2342. |
[53] | MA N N , ZHANG X Y , ZHENG H T ,et al. ShuffleNet V2:practical guidelines for efficient CNN architecture design[C]// Computer Vision- ECCV 2018. Berlin:Springer, 2018: 116-131. |
[54] | HUANG T C , ZHOU C , ZHANG R X ,et al. Stick:a harmonious fusion of buffer-based and learning-based approach for adaptive streaming[C]// IEEE INFOCOM 2020 - IEEE Conference on Computer Communications. Piscataway:IEEE Press, 2020: 1967-1976. |
[55] | HUANG T Y , JOHARI R , MCKEOWN N ,et al. A buffer-based approach to rate adaptation:Evidence from a large video streaming service[C]// Special Interest Group on Data Communication (SIGCOMM). New York:ACM Press, 2014: 187-198. |
[56] | RIISER H , VIGMOSTAD P , GRIWODZ C ,et al. Commute path bandwidth traces from 3G networks:analysis and applications[C]// Proceedings of the 4th ACM Multimedia Systems Conference - MMSys '13. New York:ACM Press, 2013: 114-118. |
[57] | US Federal Communications Commission . Measuring Fixed Broadband Report[R]. 2016. |
[58] | RACA D , QUINLAN J J , ZAHRAN A H ,et al. Beyond throughput:a 4G LTE dataset with channel and context metrics[C]// Proceedings of the 9th ACM Multimedia Systems Conference. New York:ACM Press, 2018: 460-465. |
[59] | HOOFT J V D , PETRANGELI S , WAUTERS T ,et al. HTTP/2-based adaptive streaming of HEVC video over 4G/LTE networks[J]. IEEE Communications Letters, 2016,20(11): 2177-2180. |
[60] | MENG Z L , CHEN J , GUO Y N ,et al. PiTree:practical implementation of ABR algorithms using decision trees[C]// Proceedings of the 27th ACM International Conference on Multimedia. New York:ACM Press, 2019: 2431-2439. |
[61] | MOORTHY A K , CHOI L K , BOVIK A C ,et al. Video quality assessment on mobile devices:subjective,behavioral and objective studies[J]. IEEE Journal of Selected Topics in Signal Processing, 2012,6(6): 652-671. |
[62] | CHEN C , CHOI L K , VECIANA G D ,et al. Modeling the time-varying subjective quality of HTTP video streams with rate adaptations[J]. IEEE Transactions on Image Processing, 2014,23(5): 2206-2221. |
[63] | GHADIYARAM D , BOVIK A C , YEGANEH H ,et al. Study of the effects of stalling events on the quality of experience of mobile streaming videos[C]// 2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP). Piscataway:IEEE Press, 2014: 989-993. |
[64] | GHADIYARAM D , PAN J , BOVIK A C . A subjective and objective study of stalling events in mobile streaming videos[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2019,29(1): 183-197. |
[65] | BAMPIS C G , LI Z , KATSAVOUNIDIS I ,et al. Towards perceptually optimized adaptive video streaming-a realistic quality of experience database[J]. IEEE Transactions on Image Processing, 2021,30: 5182-5197. |
[66] | DUANMU Z F , REHMAN A , WANG Z . A quality-of-experience database for adaptive video streaming[J]. IEEE Transactions on Broadcasting, 2018,64(2): 474-487. |
[67] | DUANMU Z F , CHEN D , LI Z W ,et al. Assessing the quality-of-experience of adaptive bitrate video streaming[J]. arXiv Preprint,arXiv:2008.08804, 2020. |
[68] | BOKANI A , HASSAN M , KANHERE S S ,et al. Comprehensive mobile bandwidth traces from vehicular networks[C]// Proceedings of the 7th International Conference on Multimedia Systems. New York:ACM Press, 2016: 344-348. |
[69] | RACA D , LEAHY D , SREENAN C J ,et al. Beyond throughput,the next generation:a 5G dataset with channel and context metrics[C]// Proceedings of the 11th ACM Multimedia Systems Conference. New York:ACM Press, 2020: 303-308. |
[70] | SPITERI K , URGAONKAR R , SITARAMAN R K . BOLA:near-optimal bitrate adaptation for online videos[J]. IEEE/ACM Transactions on Networking, 2020,28(4): 1698-1711. |
[71] | JIANG J C , SEKAR V , ZHANG H . Improving fairness,efficiency,and stability in HTTP-based adaptive video streaming with festive[J]. IEEE/ACM Transactions on Networking, 2014,22(1): 326-340. |