Background reference frame generation method for surveillance video based on image block codebook model

Journal on Communications, 2023, Vol. 44, Issue (1): 129-141. doi: 10.11959/j.issn.1000-436x.2023003

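Wei ZHANG, Yu WANG, Xinyi CHEN, Yanwen WANG, Qingyang JING, Weimin LEI

School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China

Revised: 2022-12-07  Online: 2023-01-25  Published: 2023-01-01
About the authors:
Wei ZHANG (1980- ), female, born in Jining, Shandong, Ph.D., lecturer at Northeastern University. Her research interests include intelligent multimedia signal processing and network multipath transmission optimization.
Yu WANG (1997- ), male, born in Qiqihar, Heilongjiang, master's student at Northeastern University. His research interest is intelligent multimedia signal processing.
Xinyi CHEN (1994- ), female, born in Chengde, Hebei, Ph.D. student at Northeastern University. Her research interests include computer vision and video and image compression coding.
Yanwen WANG (1998- ), female, born in Liaoyang, Liaoning, Ph.D. student at Northeastern University. Her research interests include computer vision and video and image compression coding.
Qingyang JING (1994- ), female, born in Shenyang, Liaoning, Ph.D. student at Northeastern University. Her research interests include computer vision and video and image compression coding.
Weimin LEI (1969- ), male, born in Pingyao, Shanxi, Ph.D., professor at Northeastern University. His research interests include intelligent multimedia signal processing, network multipath transmission optimization, and industrial real-time communication technology.
Supported by:
The National Key Research and Development Program of China (2018YFB1702000); The Fundamental Research Funds for the Central Universities (N2216010)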

Abstract

Background reference frames are often seriously contaminated by foreground objects, and transmitting a complete background reference frame at once causes a sudden increase in bit rate. To address these problems, a progressive background reference frame generation method that uses the image block as its basic unit was proposed for surveillance video with a relatively stable background. A clustering-based image block codebook model was established, in which image blocks at the same position in the video sequence were clustered by perceptual-hash-based code element matching. Background code elements were accurately detected by using the regional characteristics of background images. A complete background reference frame was then generated from background image blocks that the codebook model detected in different frames. Experimental results show that, compared with the standard HM16.20, the proposed method improves coding efficiency by 17.89% on the luma component and effectively improves the quality of the generated background reference frame, while its time complexity meets the real-time requirements of video applications.

Key words: surveillance video, background modeling, video coding, codebook model, background reference frame
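
The perceptual-hash-based matching of co-located image blocks described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the hash is a simple average hash (the page does not say which perceptual hash the authors use), the threshold `max_distance` is hypothetical, and each cluster is represented by the hash of its first block rather than by a running center block.

```python
from typing import List

Block = List[List[float]]  # luma samples of one image block, row by row


def average_hash(block: Block) -> int:
    """Pack one bit per sample: 1 if the sample is above the block mean."""
    samples = [v for row in block for v in row]
    mean = sum(samples) / len(samples)
    bits = 0
    for v in samples:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def cluster_blocks(blocks: List[Block], max_distance: int = 2) -> List[List[int]]:
    """Group blocks taken from the same position in successive frames.

    Returns one list of frame indices per cluster. A block joins the first
    cluster whose representative hash is within max_distance bits,
    otherwise it starts a new cluster.
    """
    hashes: List[int] = []           # representative hash of each cluster
    clusters: List[List[int]] = []   # frame indices absorbed by each cluster
    for t, block in enumerate(blocks):
        h = average_hash(block)
        for k, ref in enumerate(hashes):
            if hamming(h, ref) <= max_distance:
                clusters[k].append(t)
                break
        else:
            hashes.append(h)
            clusters.append([t])
    return clusters


# Toy 4x4 blocks: a static background pattern and a passing foreground object.
background = [[10, 10, 200, 200]] * 4
passerby = [[200, 200, 10, 10]] * 4
frames = [background, background, passerby, background]
clusters = cluster_blocks(frames)
print(clusters)
```

In this toy run the three background blocks fall into one cluster and the foreground block into another, which is the property a later background-detection step can exploit.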

CLC number: TN92

Wei ZHANG, Yu WANG, Xinyi CHEN, Yanwen WANG, Qingyang JING, Weimin LEI. Background reference frame generation method for surveillance video based on image block codebook model[J]. Journal on Communications, 2023, 44(1): 129-141.

Figures and tables (25): Figures 1-19 and Tables 1-6

Table 3 Time complexity comparison of different methods (time required to process 500 frames, in s)

Test sequence	GMM	SWRA	CB	Proposed method
Crossroad	20.466	17.321	19.313	15.463
Campus	20.175	17.512	19.842	16.385
Classover	20.023	17.608	19.815	16.337
Average	20.221	17.480	19.657	16.061
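
The timings in Table 3 can be turned into an approximate throughput. The sketch below assumes throughput is simply 500 frames divided by the average time in the table; it says nothing about how the timings were measured.

```python
# Average time (s) to process 500 frames, copied from Table 3.
seconds_per_500_frames = {
    "GMM": 20.221,
    "SWRA": 17.480,
    "CB": 19.657,
    "Proposed": 16.061,
}
FRAMES = 500

# Frames per second for each method.
fps = {name: FRAMES / s for name, s in seconds_per_500_frames.items()}

for name, rate in fps.items():
    print(f"{name}: {rate:.1f} frames/s")
```

Under this assumption the proposed method processes roughly 31 frames per second on average, ahead of the other three methods in the table.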



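Codebook member	Meaning
ce₁, ce₂, …, ce_L	The L code elements contained in the codebook
L	Number of code elements in the codebook
f_max	Maximum value of attribute f over the code elements in the codebook
λ_max	Maximum value of attribute λ over the code elements in the codebook
flag_bgce	Whether a background code element has been established for the codebook (0 or 1)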

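Code element member	Meaning
mat	Center image block, the average of the image blocks in the code element
q	Time at which the code element was last updated with an image block
f	Number of image blocks in the code element
λ	Accumulated number of frames between temporally adjacent image blocks in the code element
isBgce	Whether the code element is a background code element (0 or 1)
isMark	Whether the code element has been marked (0 or 1)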
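
The two member tables above map naturally onto a pair of record types. The following is a sketch under stated assumptions, not the authors' implementation: the field names follow the tables (`lam` stands in for λ), but the `absorb` update rule is hypothetical, treating `mat` as a running mean and λ as the sum of `t - q` gaps, and the codebook-level maxima are derived on demand rather than stored.

```python
from dataclasses import dataclass, field
from typing import List

Block = List[List[float]]  # luma samples of one image block, row by row


@dataclass
class CodeElement:
    mat: Block             # center block: mean of all member blocks
    q: int                 # frame index of the most recent update
    f: int = 1             # number of member blocks
    lam: int = 0           # λ: accumulated frame gaps between adjacent member blocks
    is_bgce: bool = False  # isBgce: background code element flag
    is_mark: bool = False  # isMark: marked flag

    def absorb(self, block: Block, t: int) -> None:
        """Fold a matched block from frame t into this element (assumed rule)."""
        self.mat = [
            [(m * self.f + b) / (self.f + 1) for m, b in zip(mrow, brow)]
            for mrow, brow in zip(self.mat, block)
        ]
        self.lam += t - self.q
        self.q = t
        self.f += 1


@dataclass
class Codebook:
    elements: List[CodeElement] = field(default_factory=list)  # ce_1 ... ce_L
    flag_bgce: bool = False  # whether a background code element exists

    @property
    def L(self) -> int:
        return len(self.elements)

    @property
    def f_max(self) -> int:
        return max((ce.f for ce in self.elements), default=0)

    @property
    def lam_max(self) -> int:
        return max((ce.lam for ce in self.elements), default=0)


# A code element created at frame 0 absorbs a second block at frame 3.
ce = CodeElement(mat=[[10.0, 20.0]], q=0)
ce.absorb([[30.0, 40.0]], t=3)
book = Codebook(elements=[ce])
print(ce.mat, ce.f, ce.lam, book.f_max, book.lam_max)
```

Deriving `f_max` and `lam_max` as properties keeps them consistent with the elements at the cost of a scan per access; caching them, as the stored members in the table suggest, would be the natural choice for a per-block codebook updated every frame.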

BD-rate of the proposed method relative to HM16.20

Video sequence	Y	U	V	YUV
Bank	-21.24%	-78.46%	-80.24%	-35.08%
Campus	-20.99%	-75.37%	-79.45%	-26.47%
Classover	-18.02%	-75.20%	-78.32%	-23.84%
Crossroad	-8.37%	-75.60%	-69.32%	-18.61%
Office	-9.19%	-73.79%	-70.50%	-16.10%
Overbridge	-29.54%	-80.18%	-79.46%	-39.95%
Average	-17.89%	-76.43%	-76.22%	-26.68%
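
A negative BD-rate means the method needs less bit rate than the anchor at equal quality, so the 17.89% figure quoted in the abstract corresponds to the average luma (Y) entry of this table. The sketch below cross-checks the average row, assuming it is the plain arithmetic mean of the six sequences; the tolerance only absorbs two-decimal rounding.

```python
# BD-rate (%) per sequence, in the order (Y, U, V, YUV), copied from the table.
bd_rate = {
    "Bank": (-21.24, -78.46, -80.24, -35.08),
    "Campus": (-20.99, -75.37, -79.45, -26.47),
    "Classover": (-18.02, -75.20, -78.32, -23.84),
    "Crossroad": (-8.37, -75.60, -69.32, -18.61),
    "Office": (-9.19, -73.79, -70.50, -16.10),
    "Overbridge": (-29.54, -80.18, -79.46, -39.95),
}
reported = (-17.89, -76.43, -76.22, -26.68)  # the "Average" row

# Arithmetic mean of each component over all sequences.
means = [
    sum(row[i] for row in bd_rate.values()) / len(bd_rate)
    for i in range(4)
]

for label, mean, rep in zip(("Y", "U", "V", "YUV"), means, reported):
    print(f"{label}: computed {mean:.3f}%, reported {rep:.2f}%")
```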

Sequence	Resolution	Frame rate/(frame·s^-1)	Number of frames	Sampling format
Deadline	320×256	30	1000	4:2:0
Students	320×256	30	1000	4:2:0
Johnny	1280×704	60	600	4:2:0
Vidyo3	1280×704	60	600	4:2:0

BD-rate of the proposed method relative to HM16.20

Video sequence	Y	U	V	YUV
Deadline	-15.91%	-34.03%	-32.79%	-17.69%
Students	-29.29%	-48.59%	-48.35%	-31.08%
Johnny	-8.57%	-44.06%	-33.28%	-11.64%
Vidyo3	-16.61%	-41.90%	-59.51%	-19.27%
Average	-17.59%	-42.15%	-43.48%	-19.92%

