Fast QTMT partition decision based on deep learning

doi:10.11959/j.issn.1000-0801.2021062

Abstract

Compared with its predecessors, versatile video coding (VVC) significantly improves compression efficiency through a quadtree with nested multi-type tree (QTMT) structure, but at the expense of extremely high coding complexity. To reduce this complexity, a fast QTMT partition method based on deep learning is proposed. First, an attention-asymmetric convolutional neural network predicts the probability of each partition mode. Then, partition modes are decided quickly by comparing these probabilities against a threshold. Finally, a cost that combines coding performance and encoding time is defined to obtain the optimal threshold, and a threshold decision method is derived from it. Experimental results at three levels show that the proposed method achieves average time savings of 48.62%/52.93%/62.01% with small BDBR increases of 1.05%/1.33%/2.38%. These results demonstrate that the proposed method significantly outperforms other state-of-the-art methods.
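
The threshold-based decision step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the mode labels, the function name `select_modes`, the example probabilities, and the fallback to the most probable mode are all assumptions made for illustration. It only shows the general idea of skipping partition modes whose predicted probability falls below a threshold.

```python
from typing import Dict, List

# The six QTMT options of a VVC coding unit: no split, quadtree split,
# and horizontal/vertical binary and ternary splits (labels are illustrative).
MODES = ["NS", "QT", "BT_H", "BT_V", "TT_H", "TT_V"]


def select_modes(probs: Dict[str, float], threshold: float) -> List[str]:
    """Return the modes that would still be checked by the encoder.

    Modes with predicted probability below `threshold` are skipped.
    Assumption: if every mode is below the threshold, the single most
    probable mode is kept so the candidate list is never empty.
    """
    kept = [m for m in MODES if probs.get(m, 0.0) >= threshold]
    if not kept:
        kept = [max(MODES, key=lambda m: probs.get(m, 0.0))]
    return kept


# Hypothetical network output for one coding unit (made-up numbers).
example_probs = {
    "NS": 0.05, "QT": 0.55, "BT_H": 0.25,
    "BT_V": 0.10, "TT_H": 0.03, "TT_V": 0.02,
}
candidates = select_modes(example_probs, threshold=0.2)
```

In this sketch a higher threshold prunes more modes, which saves more encoding time at the risk of a larger bitrate penalty; that trade-off is what the abstract's threshold selection addresses.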

Key words: VVC, QTMT, fast partition decision, deep learning

CLC Number:

TP393

Shuang PENG, Xiaodong WANG, Zongju PENG, Fen CHEN. Fast QTMT partition decision based on deep learning[J]. Telecommunications Science, 2021, 37(4): 73-81.

Class	Sequence	Proposed (ω=0.85)		Lei et al. [11]		Fu et al. [23]		Proposed (ω=1)		Yang et al. [22]		Proposed (ω=∞)
		BD-BR	ATR	BD-BR	ATR	BD-BR	ATR	BD-BR	ATR	BD-BR	ATR	BD-BR	ATR
A1	Tango2	1.19%	40.19%	0.97%	46.80%	1.72%	54.00%	1.52%	44.57%	0.99%	50.98%	2.96%	57.00%
	FoodMarket4	0.73%	26.41%	0.59%	45.90%	1.18%	53.00%	0.84%	27.29%	0.62%	53.53%	2.62%	44.55%
	Campfire*	1.29%	44.37%	0.81%	42.10%	1.04%	46.00%	1.70%	50.92%	0.95%	49.67%	2.84%	59.00%
A2	CatRobot1	1.37%	41.47%	1.19%	41.60%	1.51%	43.00%	1.85%	48.21%	1.05%	44.75%	3.08%	58.43%
	DaylightRoad2	1.33%	50.09%	1.17%	45.40%	1.53%	49.00%	1.84%	55.84%	0.43%	60.83%	2.81%	66.12%
	ParkRunning3*	1.34%	32.57%	0.52%	31.50%	0.54%	51.00%	1.60%	39.18%	0.82%	53.67%	2.37%	52.95%
B	MarketPlace	1.40%	45.40%	0.88%	45.10%	0.76%	52.00%	1.61%	51.18%	—	—	2.99%	64.30%
	RitualDance	1.60%	44.06%	1.10%	45.60%	1.14%	45.00%	2.18%	51.29%	—	—	3.74%	59.89%
	Cactus*	1.32%	51.16%	1.02%	41.90%	0.96%	44.00%	1.60%	56.53%	1.95%	56.66%	2.75%	65.70%
	BasketballDrive	1.26%	51.75%	0.82%	47.20%	0.95%	49.00%	1.38%	54.86%	2.25%	64.01%	2.75%	64.87%
	BQTerrace	1.51%	54.52%	1.47%	45.50%	0.79%	48.00%	1.96%	59.63%	2.07%	56.07%	2.71%	66.99%
C	BasketballDrill	1.54%	48.93%	0.82%	45.70%	1.89%	46.00%	1.84%	52.83%	2.01%	48.19%	3.56%	63.80%
	BQMall*	0.67%	57.70%	0.89%	46.90%	0.87%	39.00%	0.85%	61.26%	2.15%	55.23%	1.59%	67.43%
	PartyScene	0.54%	56.58%	0.63%	44.20%	0.46%	42.00%	0.63%	58.53%	0.60%	45.73%	1.11%	65.08%
	RaceHorsesC	0.83%	54.58%	2.03%	56.20%	0.71%	42.00%	1.05%	59.05%	1.16%	48.39%	1.77%	65.88%
D	BasketballPass*	0.71%	53.90%	0.99%	44.40%	0.70%	43.00%	0.79%	54.25%	2.33%	45.85%	1.65%	61.37%
	BQSquare	0.61%	56.22%	0.89%	43.10%	0.47%	44.00%	0.75%	58.24%	0.81%	46.06%	1.32%	64.18%
	BlowingBubbles	0.70%	50.63%	0.91%	47.10%	0.56%	37.00%	0.82%	53.57%	0.77%	41.56%	1.46%	60.41%
	RaceHorses	0.75%	48.76%	1.07%	42.00%	0.75%	39.00%	1.01%	52.44%	0.86%	43.17%	1.76%	59.62%
E	FourPeople	0.84%	52.84%	1.42%	51.10%	1.37%	41.00%	1.00%	57.86%	2.75%	57.64%	2.08%	67.83%
	Johnny*	0.92%	53.74%	1.35%	46.20%	1.33%	39.00%	1.54%	60.83%	3.29%	58.98%	2.47%	65.32%
	KristenAndSara	0.74%	53.74%	1.20%	46.30%	1.24%	40.00%	0.99%	56.04%	2.51%	59.19%	1.94%	63.43%
	Average	1.05%	48.62%	1.03%	45.08%	1.02%	44.82%	1.33%	52.93%	1.52%	52.01%	2.38%	62.01%
	STD	0.32%	5.99%	0.25%	2.71%	0.34%	4.07%	0.42%	5.31%	0.76%	5.57%	0.61%	4.08%
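
As a quick consistency check, the averages reported for the proposed method at ω=0.85 can be recomputed from the 22 per-sequence values in the table. The sketch below simply transcribes those two columns; the variable names are illustrative.

```python
from statistics import mean

# BD-BR (%) of the proposed method at omega = 0.85, in table order.
bd_br = [1.19, 0.73, 1.29, 1.37, 1.33, 1.34, 1.40, 1.60, 1.32, 1.26, 1.51,
         1.54, 0.67, 0.54, 0.83, 0.71, 0.61, 0.70, 0.75, 0.84, 0.92, 0.74]

# ATR (%) of the proposed method at omega = 0.85, in table order.
atr = [40.19, 26.41, 44.37, 41.47, 50.09, 32.57, 45.40, 44.06, 51.16, 51.75,
       54.52, 48.93, 57.70, 56.58, 54.58, 53.90, 56.22, 50.63, 48.76, 52.84,
       53.74, 53.74]

avg_bd_br = round(mean(bd_br), 2)  # 1.05, matching the table's average row
avg_atr = round(mean(atr), 2)      # 48.62, matching the table's average row
```

Both values agree with the average row and with the first pair of figures quoted in the abstract.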

References 31

[1]	JCT-VC. High efficiency video coding (HEVC) text specification draft 10: JCTVC-L1003[S]. 2013.
[2]	JVET. Meeting report of the 10th JVET meeting: JVET-J1000[S]. 2018.
[3]	JVET. Algorithm description for versatile video coding and test model 2: JVET-K1002[S]. 2018.
[4]	ZHOU Y, HU X, GUO X Q. Research on image partition technology in H.266/VVC[J]. 广播与电视技术, 2019, 46(11): 40-44.
[5]	JVET. AHG report: test model software development (AHG3): JVET-J0003[S]. 2018.
[6]	PAKDAMAN F, ADELIMANESH M A, GABBOUJ M, et al. Complexity analysis of next-generation VVC encoding and decoding[EB]. 2020. arXiv:2005.10801.
[7]	YAO Y B, LI X J. Fast block partitioning algorithm for HEVC based on spatial correlation and image texture[J]. Telecommunications Science, 2015, 31(1): 38-46.
[8]	KUO Y, CHEN P, LIN H. A spatiotemporal content-based CU size decision algorithm for HEVC[J]. IEEE Transactions on Broadcasting, 2020, 66(1): 100-112.
[9]	JAMALI M, COULOMBE S. Fast HEVC intra mode decision based on RDO cost prediction[J]. IEEE Transactions on Broadcasting, 2019, 65(1): 109-122.
[10]	HUANG B, CHEN Z, CAI Q, et al. Rate-distortion-complexity optimized coding mode decision for HEVC[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(3): 795-809.
[11]	LEI M, LUO F, ZHANG X, et al. Look-ahead prediction based coding unit size pruning for VVC intra coding[C]// Proceedings of IEEE International Conference on Image Processing. Piscataway: IEEE Press, 2019: 4120-4124.
[12]	CHEN J, SUN H, KATTO J, et al. Fast QTMT partition decision algorithm in VVC intra coding based on variance and gradient[C]// Proceedings of IEEE Visual Communications and Image Processing. Piscataway: IEEE Press, 2019: 1-4.
[13]	FAN Y, CHEN J, SUN H, et al. A fast QTMT partition decision strategy for VVC intra prediction[J]. IEEE Access, 2020, 8: 107900-107911.
[14]	PARK S, KANG J. Context-based ternary tree decision method in versatile video coding for fast intra coding[J]. IEEE Access, 2019, 7: 172597-172605.
[15]	PARK S, KANG J. Fast affine motion estimation for versatile video coding (VVC) encoding[J]. IEEE Access, 2019, 7: 158075-158084.
[16]	LIU X, LI Y, LIU D, et al. An adaptive CU size decision algorithm for HEVC intra prediction based on complexity classification using machine learning[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2019, 29(1): 144-155.
[17]	CHEN Z, SHI J, LI W. Learned fast HEVC intra coding[J]. IEEE Transactions on Image Processing, 2020, 29: 5431-5446.
[18]	KATAYAMA T, KURODA K, SHI W, et al. Low-complexity intra coding algorithm based on convolutional neural network for HEVC[C]// Proceedings of International Conference on Information and Computer Technologies. Piscataway: IEEE Press, 2018: 115-118.
[19]	KIM K, RO W W. Fast CU depth decision for HEVC using neural networks[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2019, 29(5): 1462-1473.
[20]	XU M, LI T Y, WANG Z, et al. Reducing complexity of HEVC: a deep learning approach[J]. IEEE Transactions on Image Processing, 2018, 27(10): 5044-5059.
[21]	TANG G, JING M, ZENG X, et al. Adaptive CU split decision with pooling-variable CNN for VVC intra encoding[C]// Proceedings of IEEE Visual Communications and Image Processing. Piscataway: IEEE Press, 2019: 1-4.
[22]	YANG H, SHEN L, DONG X, et al. Low-complexity CTU partition structure decision and fast intra mode decision for versatile video coding[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(6): 1668-1682.
[23]	FU T, ZHANG H, MU F. Fast CU partitioning algorithm for H.266/VVC intra-frame coding[C]// Proceedings of IEEE International Conference on Multimedia and Expo. Piscataway: IEEE Press, 2019: 55-60.
[24]	AMESTOY T, MERCAT A, HAMIDOUCHE W, et al. Tunable VVC frame partitioning based on lightweight machine learning[J]. IEEE Transactions on Image Processing, 2020, 29: 1313-1328.
[25]	JIA C M, ZHAO Z H, WANG S S, et al. Neural network based image and video coding technologies[J]. Telecommunications Science, 2019, 35(5): 32-42.
[26]	WIECKOWSKI A, MA J, SCHWARZ H, et al. Fast partitioning decision strategies for the upcoming versatile video coding (VVC) standard[C]// Proceedings of IEEE International Conference on Image Processing. Piscataway: IEEE Press, 2019: 4130-4134.
[27]	HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7132-7141.
[28]	CHEN Y P, DAI X Y, LIU M C. Dynamic convolution: attention over convolution kernels[C]// Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 11030-11039.
[29]	JVET. Algorithm description for versatile video coding and test model 7: JVET-P2002[S]. 2019.
[30]	JVET. JVET common test conditions and software reference configurations: JVET-K1010[S]. 2018.
[31]	VCEG. Calculation of average PSNR differences between RD curves: VCEG-M33[S]. 2001.

Sequence		R-square
Sequence	α=μ·φ+ν	n=a·b^φ+c	n=a·b^φ+c·d^φ
Campfire	0.9983	0.9872	0.9978
DaylightRoad2	0.9995	0.9896	0.9988
Cactus	0.9989	0.9899	0.9987
BasketballDrill	0.9971	0.9843	0.9982
BasketballPass	0.9828	0.9661	0.9955
FourPeople	0.9990	0.9830	0.9970
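
For readers unfamiliar with the R-square column, the following sketch computes the standard coefficient of determination, R² = 1 − SS_res/SS_tot, for any fitted model. The function name and the toy numbers are illustrative and are not taken from the paper.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left after applying the fitted model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy data (made up): a perfect fit scores 1.0, while predicting the
# mean of the observations for every point scores 0.0.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Values close to 1 mean that the fitted curve explains nearly all of the variation in the measured data.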

Sequence	μ	a	b	c	d
Campfire	0.60	0.58	1.88	8.46×10^-11	23.66
DaylightRoad2	0.61	0.57	1.88	2.63×10^-8	18.06
Cactus	0.64	0.60	1.88	4.36×10^-10	21.93
BasketballDrill	0.62	0.65	1.84	2.43×10^-8	17.55
BasketballPass	0.46	0.59	1.63	1.50×10^-14	32.44
FourPeople	0.56	0.54	1.83	2.75×10^-10	22.66
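
The fitted parameters can be plugged back into the two-term exponential form listed in the R-square table. The sketch below assumes that the operator garbled as "?" in the table header is a multiplication, so that the model reads a·b^φ + c·d^φ; this reading, the function name, and the dictionary layout are assumptions. The page does not define φ or the fitted quantity, so no interpretation of the output is attempted.

```python
# Parameters a, b, c, d per sequence, transcribed from the table above.
PARAMS = {
    "Campfire":        {"a": 0.58, "b": 1.88, "c": 8.46e-11, "d": 23.66},
    "DaylightRoad2":   {"a": 0.57, "b": 1.88, "c": 2.63e-8,  "d": 18.06},
    "Cactus":          {"a": 0.60, "b": 1.88, "c": 4.36e-10, "d": 21.93},
    "BasketballDrill": {"a": 0.65, "b": 1.84, "c": 2.43e-8,  "d": 17.55},
    "BasketballPass":  {"a": 0.59, "b": 1.63, "c": 1.50e-14, "d": 32.44},
    "FourPeople":      {"a": 0.54, "b": 1.83, "c": 2.75e-10, "d": 22.66},
}


def double_exp(phi: float, a: float, b: float, c: float, d: float) -> float:
    """Evaluate a * b**phi + c * d**phi (assumed reading of the fitted model)."""
    return a * b ** phi + c * d ** phi


# Evaluate the assumed model for one sequence at an arbitrary phi.
value_at_one = double_exp(1.0, **PARAMS["Campfire"])
```

Because c is many orders of magnitude smaller than a, the second term only matters once d^φ becomes very large, that is, at large φ.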
