Neural network based image and video coding technologies

doi:10.11959/j.issn.1000-0801.2019142

Abstract

Deep neural networks have made remarkable progress in artificial intelligence in recent years, which has renewed broad and in-depth research interest in neural networks. Neural network based image and video coding has recently become one of the most active research topics. This paper gives a systematic review of neural network based image and video coding technologies and their progress, organized by network structure and coding module. It surveys image compression methods built on multi-layer perceptrons, random neural networks, convolutional neural networks, recurrent neural networks and generative adversarial networks, and then the various deep learning based video coding tools. Future trends in neural network based coding are also analyzed and discussed.

Key words: neural network, deep learning, convolutional neural network (CNN), image compression, video coding

CLC number:

TP393

Chuanmin JIA, Zhenghui ZHAO, Shanshe WANG, Siwei MA. Neural network based image and video coding technologies[J]. Telecommunications Science, 2019, 35(5): 32-42.

Figures/Tables: 9 (Figures 1-9)
