电信科学 (Telecommunications Science) ›› 2022, Vol. 38 ›› Issue (2): 92-102. doi: 10.11959/j.issn.1000-0801.2022030

• Research and Development •

Just noticeable distortion model based on video temporal perception characteristics

Yafen XING1, Haibing YIN1, Hongkui WANG1,2, Qionghua LUO1

  1 College of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
    2 College of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, China
  • Revised: 2021-12-22 Online: 2022-02-20 Published: 2022-02-01
  • About the authors: XING Yafen (1997- ), female, is a master's student at Hangzhou Dianzi University; her main research interest is perceptual video coding.
    YIN Haibing (1974- ), male, Ph.D., is a professor at Hangzhou Dianzi University; his main research interest is digital video coding and decoding.
    WANG Hongkui (1990- ), male, is a Ph.D. candidate at Huazhong University of Science and Technology; his main research interest is perceptual video coding.
    LUO Qionghua (1998- ), female, is a master's student at Hangzhou Dianzi University; her main research interest is perceptual video coding.
  • Supported by:
    The National Natural Science Foundation of China (61972123); The National Natural Science Foundation of China (61931008); The National Natural Science Foundation of China (62031009); Zhejiang Provincial Vanguard Research and Development Project (2022C01068)

A just noticeable difference model based on video temporal perception characteristics

Yafen XING1, Haibing YIN1, Hongkui WANG1,2, Qionghua LUO1

  1 College of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
    2 College of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, China
  • Revised: 2021-12-22 Online: 2022-02-20 Published: 2022-02-01
  • Supported by:
The National Natural Science Foundation of China (61972123); The National Natural Science Foundation of China (61931008); The National Natural Science Foundation of China (62031009); Zhejiang Provincial Vanguard Research and Development Project (2022C01068)

Abstract:

Existing temporal just noticeable distortion (JND) models do not adequately characterize the effects of temporal feature parameters, so the accuracy of spatial-temporal JND models remains unsatisfactory. To address this problem, feature parameters that accurately characterize the temporal properties of video were proposed, together with a homogenization method for fusing heterogeneous feature parameters, and the temporal JND model was improved on this basis. Feature parameters such as foreground/background motion, temporal duration, temporal prediction-residual fluctuation intensity, and inter-frame prediction residual were used to characterize the temporal features of video content. Based on the characteristics of the human visual system (HVS), perceptual probability density functions were explored, and the heterogeneous feature parameters were uniformly mapped onto self-information and information-entropy scales to achieve a homogeneous fusion measurement. The coupling of visual attention and masking was investigated from the perspective of energy allocation, and a temporal JND weight model was constructed accordingly. On the basis of the spatial JND threshold, the temporal weight was fused to obtain a more accurate spatial-temporal JND model. To evaluate the performance of the spatial-temporal JND model, subjective quality assessment experiments were conducted. Compared with existing JND models, at comparable perceptual quality, the proposed spatial-temporal JND model can tolerate more distortion and has a stronger ability to conceal noise.

Key words: just noticeable distortion, human visual system characteristics, visual masking, visual attention, self-information, information entropy

Abstract:

The existing temporal-domain just noticeable distortion (JND) models do not sufficiently depict the interaction between temporal parameters and human visual system (HVS) characteristics, leading to insufficient accuracy of the spatial-temporal JND model. To solve this problem, feature parameters that can accurately describe the temporal characteristics of video were explored and extracted, along with a homogenization method for fusing heterogeneous feature parameters, and the temporal-domain JND model was improved on this basis. The investigated feature parameters included foreground and background motion, temporal duration along the motion trajectory, residual fluctuation intensity along the motion trajectory, and adjacent inter-frame prediction residual, which were used to characterize the temporal characteristics of the video. Probability density functions for these feature parameters in the perceptual sense were proposed according to HVS characteristics, and the heterogeneous feature parameters were uniformly mapped onto self-information and information-entropy scales to achieve a homogeneous fusion measurement. The coupling method of visual attention and masking was explored from the perspective of energy distribution, and the temporal-domain JND weight model was constructed accordingly. On the basis of the spatial JND threshold, the temporal-domain weights were integrated to develop a more accurate spatial-temporal JND model. To evaluate the performance of the spatial-temporal JND model, a subjective quality evaluation experiment was conducted. Experimental results justify the effectiveness of the proposed model.
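As a rough illustration of the homogenization idea described in the abstract, the sketch below maps a block of inter-frame prediction residuals onto self-information and information-entropy scales and fuses them into a temporal weight applied to a spatial JND threshold. The Laplacian residual density, the fusion coefficient `alpha`, and the weighting form are all illustrative assumptions, not the model actually proposed in the paper.

```python
import numpy as np

def self_information(p):
    """Self-information I(x) = -log2 p(x) of an event with probability p."""
    return -np.log2(np.clip(p, 1e-12, 1.0))

def block_entropy(values, bins=16):
    """Shannon entropy (bits) of a block of feature values via a histogram."""
    hist, _ = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def temporal_jnd_weight(residual_block, sigma=8.0, alpha=0.5):
    """Toy temporal weight fusing residual self-information and entropy.

    Assumes a zero-mean Laplacian density for prediction residuals; the
    normalization and fusion coefficient alpha are illustrative only.
    """
    b = sigma / np.sqrt(2.0)                       # Laplacian scale parameter
    p = np.exp(-np.abs(residual_block) / b) / (2.0 * b)
    info = float(self_information(p).mean())       # average self-information
    ent = block_entropy(residual_block)            # block information entropy
    return 1.0 + alpha * info / (1.0 + ent)        # larger weight -> higher JND

# Usage: a block with strong residual fluctuation should tolerate more
# temporal distortion (larger weight) than a smooth, predictable block.
rng = np.random.default_rng(0)
smooth = rng.laplace(0.0, 1.0, size=64)            # small prediction residuals
noisy = rng.laplace(0.0, 12.0, size=64)            # large prediction residuals
spatial_jnd = 4.0                                  # assumed spatial JND threshold
print(temporal_jnd_weight(smooth) * spatial_jnd)
print(temporal_jnd_weight(noisy) * spatial_jnd)
```

In this toy formulation the weight never drops below 1, so the fused spatial-temporal threshold never falls under the spatial JND; whether that boundary behavior matches the paper's model cannot be confirmed from the abstract alone.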

Key words: JND, HVS characteristics, visual masking, visual attention, self-information, information entropy

CLC number:
