A fast block partitioning algorithm for VVC intra coding based on depthwise separable convolution

doi:10.11959/j.issn.1000-0801.2023132

Telecommunications Science ›› 2023, Vol. 39 ›› Issue (7): 99-108.

Zhen YE¹, Guoxiang WANG¹, Junfeng SONG¹, Haokun LIU², Tiansong LI²

¹ Lishui University, Lishui 323000, China
² Chongqing Normal University, Chongqing 401331, China

Revised: 2023-06-13 Online: 2023-07-20 Published: 2023-07-01
About the authors:
Zhen YE (1984- ), male, Ph.D., lecturer in the Department of Computer Science, Lishui University. His main research interests are artificial intelligence and video and image processing.
Guoxiang WANG (1989- ), male, teaching assistant at Lishui University. His main research interests are signal processing, artificial intelligence, and video and image processing.
Junfeng SONG (1984- ), male, senior laboratory technician at Lishui University. His main research interests are virtual reality and artificial intelligence.
Haokun LIU (1999- ), male, master's student at the College of Computer and Information Science, Chongqing Normal University. His main research interests are H.266/VVC video coding and deep learning.
Tiansong LI (1987- ), male, Ph.D., lecturer at the College of Computer and Information Science, Chongqing Normal University. His main research interests are image/video coding, multi-view video coding, multimedia signal processing, and artificial intelligence.
Supported by:
The Natural Science Foundation Project of Chongqing Science and Technology Bureau (CSTB2022NSCQ-MSX1231); The Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN202200519); The Talents Fund Project of Chongqing Normal University (21XLB031)

Abstract:

Recently, the Joint Video Exploration Team (JVET) adopted versatile video coding (VVC) as the new-generation video coding standard. Its quadtree plus multi-type tree (QTMT) partition structure effectively improves coding performance, but it also sharply increases encoding complexity and greatly lengthens encoding time. To address this problem, a fast block partitioning algorithm for VVC intra coding based on depthwise separable convolution is proposed. The original pixel values of a coding unit (CU) are used as input, and a lightweight depthwise separable convolutional neural network extracts texture features of the CU to guide the selection of its partition mode, yielding accurate partition mode prediction. By skipping low-probability partition modes, the scheme reduces the number of partition modes traversed for each CU and substantially lowers encoder complexity. Experimental results show that the proposed algorithm saves 18% to 48% of encoding time on VTM 15.2 while incurring an average performance loss of only 0.15%, and that the additional complexity introduced by the lightweight depthwise separable convolution is negligible.

Key words: video coding, deep learning, intra coding, coding unit partitioning
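
The mode-skipping idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the probability dictionary stands in for the output of the lightweight network, and the names `PARTITION_MODES`, `select_partition_modes`, and the threshold value 0.1 are hypothetical choices made for this example. The six modes listed are the split types available to a CU under the QTMT structure in VVC.

```python
from typing import Dict, List

# The six QTMT partition options of a VVC coding unit:
# no split, quadtree, binary (horizontal/vertical), ternary (horizontal/vertical).
PARTITION_MODES = ["NO_SPLIT", "QT", "BT_H", "BT_V", "TT_H", "TT_V"]


def select_partition_modes(probs: Dict[str, float],
                           threshold: float = 0.1) -> List[str]:
    """Return the partition modes the encoder should still evaluate.

    `probs` maps each mode to a predicted probability (a hypothetical
    network output). Modes below `threshold` are skipped. The most likely
    mode is always kept so the candidate list is never empty.
    """
    best = max(PARTITION_MODES, key=lambda m: probs.get(m, 0.0))
    kept = [m for m in PARTITION_MODES if probs.get(m, 0.0) >= threshold]
    if best not in kept:
        kept.append(best)
    return kept


# Made-up probabilities for a single CU, for illustration only.
example_probs = {"NO_SPLIT": 0.05, "QT": 0.55, "BT_H": 0.25,
                 "BT_V": 0.08, "TT_H": 0.04, "TT_V": 0.03}
candidates = select_partition_modes(example_probs)
print(candidates)
```

With these example values only `QT` and `BT_H` survive, so the encoder would run rate-distortion checks on two modes instead of six. A lower threshold trades less time saving for a smaller risk of discarding the best mode.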

CLC number: TP391

Zhen YE, Guoxiang WANG, Junfeng SONG, Haokun LIU, Tiansong LI. A fast block partitioning algorithm for VVC intra coding based on depthwise separable convolution[J]. Telecommunications Science, 2023, 39(7): 99-108.


References (22)

[1]	SULLIVAN G J, OHM J R, HAN W J, et al. Overview of the high efficiency video coding (HEVC) standard[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1649-1668.
[2]	BROSS B, WANG Y K, YE Y, et al. Overview of the versatile video coding (VVC) standard and its applications[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2021, 31(10): 3736-3764.
[3]	WAN S, YANG F Z. New generation efficient video coding H.265/HEVC: principles, standards, and implementation[M]. Beijing: Publishing House of Electronics Industry, 2014.
[4]	XIANG M. CE3: CCLM/MDLM using simplified coefficients derivation method[EB]. 2018.
[5]	CHEN J, YE Y, KIM S, et al. Algorithm description for versatile video coding and test model 5 (VTM 5)[EB]. 2019.
[6]	JIANLE C. Algorithm description for versatile video coding and test model 4 (VTM 4)[EB]. 2019.
[7]	SANTIAGO D L H. CE3: Intra sub-partitions coding mode[EB]. 2019.
[8]	JONATHAN P, BJÖRN S. CE3: affine linear weighted intra prediction[EB]. 2019.
[9]	LU J B, PENG Z J, SHU Z J, et al. Fast CU partition and angle mode decision for VVC intra coding[J]. Journal of Optoelectronics · Laser, 2021, 32(11): 1171-1179.
[10]	TAO H R, LU J Z, LI Y X. Fast division algorithm of VVC intra-coding unit[J]. Journal of Chinese Computer Systems, 2021, 42(7): 1470-1474.
[11]	TANG N, CAO J, LIANG F, et al. Fast CTU partition decision algorithm for VVC intra and inter coding[C]// Proceedings of 2019 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS). Piscataway: IEEE Press, 2020: 361-364.
[12]	DONG X C, SHEN L Q, YU M, et al. Fast intra mode decision algorithm for versatile video coding[J]. IEEE Transactions on Multimedia, 2021, 24: 400-414.
[13]	LI Y, YANG G B, SONG Y, et al. Early intra CU size decision for versatile video coding based on a tunable decision model[J]. IEEE Transactions on Broadcasting, 2021, 67(3): 710-720.
[14]	XIONG D Q, GAO W, TENG G W. Decision tree accelerated CU partition algorithm for intra prediction in H.266/VVC[J]. Industrial Control Computer, 2021, 34(7): 88-90, 92.
[15]	ZHAO J C, WU A B, ZHANG Q W. SVM-based fast CU partition decision algorithm for VVC intra coding[J]. Electronics, 2022, 11(14): 2147.
[16]	WU G Q, HUANG Y, ZHU C, et al. SVM based fast CU partitioning algorithm for VVC intra coding[C]// Proceedings of 2021 IEEE International Symposium on Circuits and Systems (ISCAS). Piscataway: IEEE Press, 2021: 1-5.
[17]	LIU X G, LI Y Y, LIU D Y, et al. An adaptive CU size decision algorithm for HEVC intra prediction based on complexity classification using machine learning[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2019, 29(1): 144-155.
[18]	ZHANG S P, FENG S X, CHEN J W, et al. A GCN-based fast CU partition method of intra-mode VVC[J]. Journal of Visual Communication and Image Representation, 2022, 88: 103621.
[19]	TECH G, PFAFF J, SCHWARZ H, et al. Fast partitioning for VVC intra-picture encoding with a CNN minimizing the rate-distortion-time cost[C]// Proceedings of 2021 Data Compression Conference (DCC). Piscataway: IEEE Press, 2021: 3-12.
[20]	WU S L, SHI J, CHEN Z B. HG-FCN: hierarchical grid fully convolutional network for fast VVC intra coding[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(8): 5638-5649.
[21]	FANG J T, LIU B H, CHANG P C. Fast coding unit partitioning algorithms for versatile video coding intra coding[J]. Journal of Visual Communication and Image Representation, 2022, 87: 103542.
[22]	DANG-NGUYEN D T, PASQUINI C, CONOTTER V, et al. RAISE: a raw images dataset for digital image forensics[C]// Proceedings of the 6th ACM Multimedia Systems Conference. New York: ACM Press, 2015: 219-224.

Class	Resolution (pixels)	Sequence	Ref. [18] BD-BR	Ref. [18] TS	Ref. [21] BD-BR	Ref. [21] TS	Proposed BD-BR	Proposed TS
B	1920×1080	BasketballDrive	0.18%	13.41%	1.79%	33.78%	0.05%	27.05%
B	1920×1080	BQTerrace	0.38%	12.52%	0.58%	30.27%	0.02%	18.31%
B	1920×1080	Cactus	0.27%	13.46%	0.81%	30.11%	0.03%	27.80%
C	832×480	BasketballDrill	0.12%	25.32%	0.92%	29.93%	0.11%	48.24%
C	832×480	BQMall	-0.03%	16.22%	1.09%	32.63%	0.11%	47.26%
C	832×480	PartyScene	0.00%	17.97%	0.22%	25.50%	0.04%	35.18%
C	832×480	RaceHorses	0.19%	17.30%	0.45%	31.64%	0.08%	46.38%
D	416×240	BasketballPass	0.32%	13.29%	1.13%	29.19%	0.14%	46.36%
D	416×240	BlowingBubbles	0.12%	14.42%	0.23%	23.93%	0.04%	40.55%
D	416×240	BQSquare	0.32%	20.93%	0.08%	19.95%	0.02%	26.23%
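
To make the comparison easier to read, the per-sequence figures in the table above can be averaged per method. The sketch below only does arithmetic on the values copied from that table. The labels and the helper name `summarize` are assumptions introduced for this example, and the averages cover these ten sequences only, so they need not match the overall averages reported in the paper.

```python
from statistics import mean

# (BD-BR %, TS %) per sequence, copied row by row from the comparison table.
results = {
    "Ref. [18]": [(0.18, 13.41), (0.38, 12.52), (0.27, 13.46), (0.12, 25.32),
                  (-0.03, 16.22), (0.00, 17.97), (0.19, 17.30), (0.32, 13.29),
                  (0.12, 14.42), (0.32, 20.93)],
    "Ref. [21]": [(1.79, 33.78), (0.58, 30.27), (0.81, 30.11), (0.92, 29.93),
                  (1.09, 32.63), (0.22, 25.50), (0.45, 31.64), (1.13, 29.19),
                  (0.23, 23.93), (0.08, 19.95)],
    "Proposed": [(0.05, 27.05), (0.02, 18.31), (0.03, 27.80), (0.11, 48.24),
                 (0.11, 47.26), (0.04, 35.18), (0.08, 46.38), (0.14, 46.36),
                 (0.04, 40.55), (0.02, 26.23)],
}


def summarize(rows):
    """Return (mean BD-BR, mean TS) over a list of (BD-BR, TS) pairs."""
    bd_br = mean(r[0] for r in rows)
    ts = mean(r[1] for r in rows)
    return bd_br, ts


for method, rows in results.items():
    bd_br, ts = summarize(rows)
    print(f"{method}: mean BD-BR = {bd_br:.3f}%, mean TS = {ts:.2f}%")
```

On these ten sequences the proposed method averages about 36.34% time saving at 0.064% BD-BR, against 16.48% at 0.187% for ref. [18] and 28.69% at 0.730% for ref. [21].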

Class	Resolution (pixels)	Sequence	Proposed BD-BR	Proposed TS
B	1920×1080	BasketballDrive	0.05%	27.05%
B	1920×1080	BQTerrace	0.02%	18.31%
B	1920×1080	Cactus	0.03%	27.80%
B	1920×1080	Kimono1	0.03%	27.48%
B	1920×1080	ParkScene	0.03%	20.29%
C	832×480	BasketballDrill	0.11%	48.24%
C	832×480	BQMall	0.11%	47.26%
C	832×480	PartyScene	0.04%	35.18%
C	832×480	RaceHorses	0.08%	46.38%
D	416×240	BasketballPass	0.14%	46.36%
D	416×240	BlowingBubbles	0.04%	40.55%
D	416×240	BQSquare	0.02%	26.23%
D	416×240	Kimono1	0.06%	40.86%
E	1280×720	BasketballDrive	0.15%	50.24%
E	1280×720	BQTerrace	0.24%	46.30%
E	1280×720	Cactus	0.14%	45.54%
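
The TS column in the tables reports encoding time saving, but this page does not give its formula. A common definition in the fast-encoding literature, assumed here rather than taken from the paper, is the relative reduction in encoding time with respect to the unmodified reference encoder. The function name `time_saving` and the timings below are hypothetical.

```python
def time_saving(t_anchor: float, t_proposed: float) -> float:
    """Percentage of encoding time saved relative to the anchor encoder.

    Assumed definition: TS = (T_anchor - T_proposed) / T_anchor * 100.
    """
    if t_anchor <= 0:
        raise ValueError("anchor encoding time must be positive")
    return (t_anchor - t_proposed) / t_anchor * 100.0


# Hypothetical encoding times in seconds, for illustration only.
print(f"TS = {time_saving(1000.0, 650.0):.2f}%")
```

Under this definition a TS of 48.24% means the modified encoder needs roughly half the time of the anchor for that sequence.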

