Telecommunications Science ›› 2019, Vol. 35 ›› Issue (5): 32-42. doi: 10.11959/j.issn.1000-0801.2019142
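Chuanmin JIA, Zhenghui ZHAO, Shanshe WANG, Siwei MA
Revised: 2019-05-08
Online: 2019-05-20
Published: 2019-05-21
About the authors:
Chuanmin JIA (1993- ), male, Ph.D. candidate at Peking University; his main research interests are image and video coding and processing. | Zhenghui ZHAO (1993- ), male, Ph.D. candidate at Peking University; his main research interests are image and video coding and processing. | Shanshe WANG (1981- ), male, Ph.D., assistant researcher at Peking University; his main research interests are video coding and video processing. | Siwei MA (1979- ), male, Ph.D., professor and doctoral supervisor at Peking University; his main research interests are video coding and processing.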
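Abstract:
Deep neural networks have made remarkable progress in artificial intelligence in recent years, prompting a wave of extensive and in-depth research on neural networks, and neural network based image and video coding has recently become an active research topic. This paper systematically reviews neural network based image and video coding technologies and their development. It surveys image compression built on frameworks including multi-layer perceptrons, random neural networks, convolutional neural networks, recurrent neural networks, and generative adversarial networks, as well as the various deep learning based video coding tools. It also analyzes the future development trends of neural network based coding.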
Chuanmin JIA, Zhenghui ZHAO, Shanshe WANG, Siwei MA. Neural network based image and video coding technologies[J]. Telecommunications Science, 2019, 35(5): 32-42.