Journal on Communications ›› 2022, Vol. 43 ›› Issue (2): 143-155. doi: 10.11959/j.issn.1000-436x.2022031
Junyan HUO1, Danni WANG1, Yanzhuo MA1, Shuai WAN2, Fuzheng YANG1
Revised: 2022-01-24
Online: 2022-02-25
Published: 2022-02-01
About the author: Junyan HUO (1982- ), female, born in Jinzhong, Shanxi, Ph.D., associate professor at Xidian University. Her main research interests include multimedia communication, video coding, and intelligent information processing.
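Abstract:
The new-generation video coding standard H.266/VVC introduces cross-component linear model (CCLM) prediction to improve compression efficiency. To address the problem that the luma and chroma components are correlated but the correlation is difficult to model, a neural-network-based cross-component prediction algorithm is proposed. The algorithm uses the luma difference between the pixel to be predicted and each reference pixel to select the strongly correlated reference pixels, which form a reference subset. The subset is then fed into a lightweight fully connected network to obtain the chroma prediction. Experimental results show that, compared with H.266/VVC test model version 10.0 (VTM10.0), the proposed algorithm improves chroma prediction accuracy and saves 0.27%, 1.54%, and 1.84% of the bitrate on Y, Cb, and Cr, respectively. A further advantage is that a single unified network structure can be used across different block sizes and coding parameters.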
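The select-then-predict pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the subset size `k`, the feature layout, the layer widths, the ReLU activation, the normalization by `max_val`, and the random weights are all hypothetical choices made for the example. The abstract only states that reference pixels are chosen by luma difference and passed to a lightweight fully connected network.

```python
import numpy as np

def select_reference_subset(target_luma, ref_luma, ref_chroma, k=4):
    """Keep the k reference pixels whose luma is closest to the target luma."""
    diff = np.abs(ref_luma - target_luma)
    idx = np.argsort(diff, kind="stable")[:k]  # smallest luma difference first
    return ref_luma[idx], ref_chroma[idx]

def fully_connected(x, weights, biases):
    """Tiny MLP: ReLU on hidden layers, linear output layer (assumed design)."""
    for layer, (w, b) in enumerate(zip(weights, biases)):
        x = x @ w + b
        if layer < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x

def predict_chroma(target_luma, ref_luma, ref_chroma, weights, biases,
                   k=4, max_val=255.0):
    """Predict (Cb, Cr) for one pixel from its luma and the reference subset."""
    sub_luma, sub_chroma = select_reference_subset(
        target_luma, ref_luma, ref_chroma, k)
    # Assumed feature layout: target luma, subset luma, subset (Cb, Cr) pairs.
    features = np.concatenate(
        ([target_luma], sub_luma, sub_chroma.ravel())) / max_val
    return fully_connected(features, weights, biases)

# Toy reference pixels (8-bit values); each chroma row is (Cb, Cr).
ref_luma = np.array([100.0, 102.0, 180.0, 101.0, 60.0, 99.0])
ref_chroma = np.array([[120.0, 130.0], [121.0, 131.0], [90.0, 160.0],
                       [119.0, 129.0], [140.0, 110.0], [122.0, 128.0]])

k = 4
in_dim = 1 + 3 * k  # 1 target luma + k subset luma + 2k subset chroma
rng = np.random.default_rng(0)
# Untrained random weights, used only to show the data flow.
weights = [rng.normal(scale=0.1, size=(in_dim, 8)),
           rng.normal(scale=0.1, size=(8, 2))]
biases = [np.zeros(8), np.zeros(2)]

sub_luma, sub_chroma = select_reference_subset(100.0, ref_luma, ref_chroma, k)
pred = predict_chroma(100.0, ref_luma, ref_chroma, weights, biases, k)
print("selected reference luma:", sub_luma)
print("predicted (Cb, Cr), normalized:", pred)
```

Because the input size depends only on `k` and not on the block dimensions, the same network can in principle be applied pixel by pixel to blocks of any size, which is consistent with the unified-structure property the abstract claims.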
Junyan HUO, Danni WANG, Yanzhuo MA, Shuai WAN, Fuzheng YANG. Efficient cross-component prediction for H.266/VVC based on lightweight fully connected networks[J]. Journal on Communications, 2022, 43(2): 143-155.
Table 4. Coding performance comparison between NNCCP and VTM10.0
Class | Sequence | Y | Cb | Cr | YCbCr |
A1 | Tango2 | -0.66% | -4.48% | -4.93% | -1.12% |
A1 | FoodMarket4 | -0.21% | -1.22% | -1.80% | -0.43% |
A1 | Campfire | -1.30% | 0.14% | -4.23% | -1.24% |
A2 | CatRobot | -0.34% | -2.06% | -2.36% | -0.65% |
A2 | DaylightRoad2 | -0.04% | -1.58% | -1.10% | -0.14% |
A2 | ParkRunning3 | -0.04% | -0.45% | -0.41% | -0.27% |
B | MarketPlace | -0.45% | -2.83% | -1.72% | -0.77% |
B | RitualDance | -0.30% | -2.13% | -3.76% | -0.64% |
B | Cactus | -0.08% | -0.94% | -0.76% | -0.19% |
B | BasketballDrive | -0.12% | -1.54% | -1.49% | -0.29% |
B | BQTerrace | -0.04% | -1.70% | -1.53% | -0.13% |
C | BasketballDrill | -0.78% | -3.69% | -3.48% | -1.22% |
C | BQMall | -0.15% | -1.44% | -1.42% | -0.33% |
C | PartyScene | -0.12% | -1.15% | -1.14% | -0.25% |
C | RaceHorses | -0.15% | -0.63% | -1.03% | -0.26% |
E | FourPeople | -0.02% | -0.59% | -0.49% | -0.08% |
E | Johnny | -0.03% | -0.52% | -0.72% | -0.11% |
E | KristenAndSara | -0.05% | -0.89% | -0.77% | -0.16% |
All | Average | -0.27% | -1.54% | -1.84% | -0.46% |
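As a quick consistency check on the average row of Table 4, the per-sequence figures can be averaged directly. The sketch below assumes the reported average is the unweighted arithmetic mean over the 18 sequences (the page does not say how it was computed); the variable names are illustrative only.

```python
# Per-sequence results transcribed from Table 4, in percent: (Y, Cb, Cr, YCbCr).
table4 = {
    "Tango2":          (-0.66, -4.48, -4.93, -1.12),
    "FoodMarket4":     (-0.21, -1.22, -1.80, -0.43),
    "Campfire":        (-1.30,  0.14, -4.23, -1.24),
    "CatRobot":        (-0.34, -2.06, -2.36, -0.65),
    "DaylightRoad2":   (-0.04, -1.58, -1.10, -0.14),
    "ParkRunning3":    (-0.04, -0.45, -0.41, -0.27),
    "MarketPlace":     (-0.45, -2.83, -1.72, -0.77),
    "RitualDance":     (-0.30, -2.13, -3.76, -0.64),
    "Cactus":          (-0.08, -0.94, -0.76, -0.19),
    "BasketballDrive": (-0.12, -1.54, -1.49, -0.29),
    "BQTerrace":       (-0.04, -1.70, -1.53, -0.13),
    "BasketballDrill": (-0.78, -3.69, -3.48, -1.22),
    "BQMall":          (-0.15, -1.44, -1.42, -0.33),
    "PartyScene":      (-0.12, -1.15, -1.14, -0.25),
    "RaceHorses":      (-0.15, -0.63, -1.03, -0.26),
    "FourPeople":      (-0.02, -0.59, -0.49, -0.08),
    "Johnny":          (-0.03, -0.52, -0.72, -0.11),
    "KristenAndSara":  (-0.05, -0.89, -0.77, -0.16),
}

# Average row as printed in Table 4.
reported = (-0.27, -1.54, -1.84, -0.46)

n = len(table4)
# Assumption: plain unweighted mean per column.
means = tuple(sum(row[i] for row in table4.values()) / n for i in range(4))

for name, mean, rep in zip(("Y", "Cb", "Cr", "YCbCr"), means, reported):
    status = "matches" if abs(mean - rep) < 0.005 else "differs from"
    print(f"{name}: recomputed {mean:.2f}% {status} reported {rep:.2f}%")
```

Under that assumption, each recomputed column mean rounds to the value in the average row, and the Y, Cb, and Cr averages are the savings quoted in the abstract.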
Table 6. Percentage of pixels that select the NNCCP mode in each test sequence under different coding QPs
Class | Sequence | QP 22 | QP 27 | QP 32 | QP 37 |
A1 | Tango2 | 20.59% | 25.12% | 27.89% | 27.93% |
A1 | FoodMarket4 | 11.49% | 13.77% | 15.31% | 14.66% |
A1 | Campfire | 19.39% | 28.11% | 30.94% | 28.26% |
A2 | CatRobot | 13.46% | 15.64% | 16.59% | 17.27% |
A2 | DaylightRoad2 | 10.40% | 8.89% | 8.66% | 8.84% |
A2 | ParkRunning3 | 13.47% | 14.08% | 14.30% | 12.69% |
B | MarketPlace | 21.53% | 27.19% | 29.38% | 26.54% |
B | RitualDance | 17.05% | 21.02% | 22.05% | 19.93% |
B | Cactus | 12.37% | 13.31% | 13.76% | 13.76% |
B | BasketballDrive | 8.88% | 9.43% | 9.25% | 8.82% |
B | BQTerrace | 15.32% | 14.50% | 13.10% | 11.58% |
C | BasketballDrill | 20.46% | 26.18% | 30.63% | 32.03% |
C | BQMall | 15.06% | 15.41% | 14.30% | 13.44% |
C | PartyScene | 17.15% | 19.85% | 18.99% | 16.57% |
C | RaceHorses | 6.15% | 8.54% | 13.75% | 15.10% |
E | FourPeople | 5.10% | 6.58% | 8.16% | 8.73% |
E | Johnny | 5.82% | 5.10% | 4.34% | 4.57% |
E | KristenAndSara | 6.79% | 5.84% | 6.82% | 6.90% |
All | Average | 13.36% | 15.48% | 16.57% | 15.98% |