Novel video anomaly detection method based on global-local self-attention network

Journal on Communications ›› 2023, Vol. 44 ›› Issue (8): 241-250. doi: 10.11959/j.issn.1000-436x.2023151

• Correspondences •

Jing YANG¹,², Chengmao WU³, Liuping ZHOU¹

¹ School of Information Engineering, Guangzhou Railway Polytechnic, Guangzhou 510430, China
² St. Paul University Philippines, Tuguegarao 3500, Philippines
³ School of Electronic Engineering, Xi’an University of Posts and Telecommunications, Xi’an 710121, China

Revised: 2023-07-19 Online: 2023-08-01 Published: 2023-08-01
About the authors: Jing YANG (1986- ), female, born in Xi’an, Shaanxi, is a Ph.D. candidate at St. Paul University Philippines and a lecturer at Guangzhou Railway Polytechnic. Her main research interests include intelligent visual recognition.
Chengmao WU (1968- ), male, born in Yilong, Sichuan, is a senior engineer at Xi’an University of Posts and Telecommunications. His main research interests include intelligent information processing, nonlinear dynamical systems and chaos, and information security.
Liuping ZHOU (1973- ), male, born in Liling, Hunan, is a senior engineer at Guangzhou Railway Polytechnic. His main research interests include communication technology and intelligent information processing.
Supported by:
The Young Innovative Talents Project of Guangdong Province (2020KQNCX198); Basic and Applied Basic Research Project of Guangzhou Basic Research Program (104267483017)

Abstract

To improve the accuracy of video anomaly detection, a novel method based on a global-local self-attention network was proposed. Firstly, the video sequence was fused with its corresponding RGB sequence to highlight the motion changes of objects. Secondly, dilated convolution layers were used to capture the temporal correlation of the video sequence within local regions, while a self-attention network was used to compute the global temporal dependencies of the video sequence. Meanwhile, the basic U-Net network was deepened and the model was trained end-to-end under the relevant motion and representation constraints, so as to improve its detection accuracy and robustness. Finally, experiments were carried out on the public datasets UCSD Ped2, CUHK Avenue and ShanghaiTech, and the test results were analyzed visually. The experimental results show that the detection accuracy (AUC) of the proposed method reaches 97.4%, 86.8% and 73.2% on the three datasets, respectively, which is clearly better than that of the compared methods.

Key words: video anomaly detection, self-attention, prediction, reconstruction
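The abstract pairs a local branch (dilated convolution over neighbouring frames) with a global branch (self-attention over the whole sequence). The following is only a minimal NumPy sketch of that idea, not the authors' network: the per-frame feature shape `(T, C)`, the shared three-tap kernel, the random projection matrices `wq`, `wk`, `wv`, and the residual sum used to fuse the two branches are all illustrative assumptions.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """Local branch (assumed form): zero-padded dilated 1-D convolution over time.

    x: (T, C) per-frame features; kernel: (K,) taps shared by all channels, K odd.
    """
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd for symmetric padding")
    pad = dilation * (k - 1) // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros_like(x, dtype=float)
    for i, w in enumerate(kernel):
        # Tap i looks (i - k // 2) * dilation frames away from the current frame.
        out += w * xp[i * dilation : i * dilation + x.shape[0]]
    return out

def self_attention(x, wq, wk, wv):
    """Global branch (assumed form): scaled dot-product self-attention over all frames."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
T, C = 8, 4                                   # hypothetical: 8 frames, 4 channels
feats = rng.normal(size=(T, C))
wq, wk, wv = (rng.normal(size=(C, C)) for _ in range(3))

local = dilated_conv1d(feats, np.array([0.25, 0.5, 0.25]), dilation=2)
global_, attn = self_attention(feats, wq, wk, wv)
fused = feats + local + global_               # residual fusion is an assumption
print(fused.shape)
```

With dilation 2 and three taps, each output frame in the local branch sees frames t-2, t and t+2, whereas every row of `attn` spreads weight over all T frames; that contrast is the local versus global distinction the abstract describes.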

CLC number: TP391.41

Jing YANG, Chengmao WU, Liuping ZHOU. Novel video anomaly detection method based on global-local self-attention network[J]. Journal on Communications, 2023, 44(8): 241-250.

Figures/Tables (7): Fig. 1, Fig. 2, Table 1, Table 2, Table 3, Fig. 3, Fig. 4

References (58)

[1]	RAMACHANDRA B , JONES M J , VATSAVAI R R . A survey of single-scene video anomaly detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022,44(5): 2293-2312.
[2]	SINGH A , JONES M J , LEARNED-MILLER E G . EVAL:explainable video anomaly localization[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2023: 18717-18726.
[3]	SALIGRAMA V , KONRAD J , JODOIN P M . Video anomaly identification[J]. IEEE Signal Processing Magazine, 2010,27(5): 18-33.
[4]	LUO W X , LIU W , LIAN D Z ,et al. Video anomaly detection with sparse coding inspired deep neural networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021,43(3): 1070-1084.
[5]	ŞENGÖNÜL E , SAMET R , ABU A Q ,et al. An analysis of artificial intelligence techniques in surveillance video anomaly detection:a comprehensive survey[J]. Applied Sciences, 2023,13(8): 4956.
[6]	HU T , LONG C , XIAO C . CRD-CGAN:category-consistent and relativistic constraints for diverse text-to-image generation[J]. arXiv Preprint,arXiv:2107.13516, 2021.
[7]	THAKARE K V , RAGHUWANSHI Y , DOGRA D P ,et al. DyAnNet:a scene dynamicity guided self-trained video anomaly detection network[C]// Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway:IEEE Press, 2023: 5541-5550.
[8]	HU T , LONG C J , XIAO C X . A novel visual representation on text using diverse conditional GAN for visual recognition[J]. IEEE Transactions on Image Processing, 2021,30: 3499-3512.
[9]	ISLAM A , LONG C J , BASHARAT A ,et al. DOA-GAN:dual-order attentive generative adversarial network for image copy-move forgery detection and localization[C]// Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway:IEEE Press, 2020: 4675-4684.
[10]	GU T P , CHEN G Y , LI J L ,et al. Stochastic trajectory prediction via motion indeterminacy diffusion[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2022: 17092-17101.
[11]	ISLAM A , LONG C J , RADKE R . A hybrid attention mechanism for weakly-supervised temporal action localization[C]// Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto:AAAI Press, 2021: 1637-1645.
[12]	LI Z , WANG Y C , ZHANG N ,et al. Deep learning-based object detection techniques for remote sensing images:a survey[J]. Remote Sensing, 2022,14(10): 2385.
[13]	ZHAO Z J , WEI S T , CHEN Q C ,et al. Masked retraining teacher-student framework for domain adaptive object detection[C]// Proceedings of the IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2023: 1-12.
[14]	LU Y W , KUMAR K M , NABAVI S S ,et al. Future frame prediction using convolutional VRNN for anomaly detection[C]// Proceedings of the 16th IEEE International Conference on Advanced Video and Signal Based Surveillance. Piscataway:IEEE Press, 2019: 1-8.
[15]	GONG D , LIU L Q , LE V ,et al. Memorizing normality to detect anomaly:memory-augmented deep autoencoder for unsupervised anomaly detection[C]// Proceedings of IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2020: 1705-1714.
[16]	PARK H , NOH J , HAM B . Learning memory-guided normality for anomaly detection[C]// Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 14360-14369.
[17]	NGUYEN T N , MEUNIER J . Anomaly detection in video sequence with appearance-motion correspondence[C]// Proceedings of IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2020: 1273-1283.
[18]	LIU Z , WU X M , ZHENG D ,et al. Generating anomalies for video anomaly detection with prompt-based feature mapping[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2023: 24500-24510.
[19]	WANG X , ZHANG S , CEN J ,et al. CLIP-guided prototype modulating for few-shot action recognition[J]. arXiv Preprint,arXiv:2303.02982, 2023.
[20]	KIM J , GRAUMAN K . Observe locally,infer globally:a space-time MRF for detecting abnormal activities with incremental updates[C]// Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2009: 2921-2928.
[21]	SAUNSHI N . Towards understanding self-supervised representation learning[D]. Princeton:Princeton University, 2022.
[22]	WANG Y Z , QIN C , BAI Y ,et al. Making reconstruction-based method great again for video anomaly detection[C]// Proceedings of IEEE International Conference on Data Mining (ICDM). Piscataway:IEEE Press, 2023: 1215-1220.
[23]	KRIZHEVSKY A , SUTSKEVER I , HINTON G E . ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017,60(6): 84-90.
[24]	MEDEL J R , SAVAKIS A . Anomaly detection in video using predictive convolutional long short-term memory networks[J]. arXiv Preprint,arXiv:1612.00390, 2016.
[25]	HASAN M , CHOI J , NEUMANN J ,et al. Learning temporal regularity in video sequences[C]// Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2016: 733-742.
[26]	ZHAO Y R , DENG B , SHEN C ,et al. Spatio-temporal autoencoder for video anomaly detection[C]// Proceedings of the 25th ACM International Conference on Multimedia. New York:ACM Press, 2017: 1933-1941.
[27]	LIU W , LI R , ZHENG M ,et al. Towards visually explaining variational autoencoders[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 8642-8651.
[28]	SELVARAJU R R , COGSWELL M , DAS A ,et al. Grad-CAM:visual explanations from deep networks via gradient-based localization[C]// Proceedings of 2017 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2017: 618-626.
[29]	VENKATARAMANAN S , PENG K C , SINGH R V ,et al. Attention guided anomaly localization in images[C]// Proceedings of European Conference on Computer Vision. Berlin:Springer, 2020: 485-503.
[30]	KIMURA D , CHAUDHURY S , NARITA M ,et al. Adversarial discriminative attention for robust anomaly detection[C]// Proceedings of 2020 IEEE Winter Conference on Applications of Computer Vision. Piscataway:IEEE Press, 2020: 2161-2170.
[31]	KINGMA D P , WELLING M . Auto-encoding variational Bayes[J]. arXiv Preprint,arXiv:1312.6114, 2013.
[32]	ZHAO B , LI F F , XING E P . Online detection of unusual events in videos via dynamic sparse coding[C]// Proceedings of Computer Vision & Pattern Recognition. Piscataway:IEEE Press, 2011: 3313-3320.
[33]	VASWANI N , ROY-CHOWDHURY A K , CHELLAPPA R . “Shape Activity”:a continuous-state HMM for moving/deforming shapes with application to abnormal activity detection[J]. IEEE Transactions on Image Processing, 2005,14(10): 1603-1616.
[34]	MAHADEVAN V , LI W X , BHALODIA V ,et al. Anomaly detection in crowded scenes[C]// Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2010: 1975-1981.
[35]	CHENG K W , CHEN Y T , FANG W H . Video anomaly detection and localization using hierarchical feature representation and Gaussian process regression[C]// Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway:IEEE Press, 2015: 2909-2917.
[36]	RUFF L , VANDERMEULEN R A , GÖRNITZ N ,et al. Deep one-class classification[C]// Proceedings of International Conference on Machine Learning. New York:PMLR, 2018: 4393-4402.
[37]	SCHÖLKOPF B , PLATT J C , SHAWE-TAYLOR J ,et al. Estimating the support of a high-dimensional distribution[J]. Neural Computation, 2001,13(7): 1443-1471.
[38]	PURWANTO D , PRAMONO R R A , CHEN Y T ,et al. Corrections to “three-stream network with bidirectional self-attention for action recognition in extreme low resolution videos”[J]. IEEE Signal Processing Letters, 2020,27: 2188.
[39]	ZHOU J T , ZHANG L , FANG Z W ,et al. Attention-driven loss for anomaly detection in video surveillance[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020,30(12): 4639-4647.
[40]	HU C , WU F , WU W ,et al. Normal learning in videos with attention prototype network[J]. arXiv Preprint,arXiv:2108.11055, 2021.
[41]	YANG Z , LIU J , WU Z ,et al. Video event restoration based on keyframes for video anomaly detection[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2023: 14592-14601.
[42]	LUO W X , LIU W , LIAN D Z ,et al. Future frame prediction network for video anomaly detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022,44(11): 7505-7520.
[43]	YU G , WANG S Q , CAI Z P ,et al. Cloze test helps:effective video anomaly detection via learning to complete video events[C]// Proceedings of the 28th ACM International Conference on Multimedia. New York:ACM Press, 2020: 583-591.
[44]	LIU Z A , NIE Y W , LONG C J ,et al. A hybrid video anomaly detection framework via memory-augmented flow reconstruction and flow-guided frame prediction[C]// Proceedings of IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2022: 13568-13577.
[45]	CHANG Y , TU Z , XIE W ,et al. Video anomaly detection with spatio-temporal dissociation[J]. Pattern Recognition, 2022,122:108213.
[46]	LE V T , KIM Y G . Attention-based residual autoencoder for video anomaly detection[J]. Applied Intelligence, 2023,53(3): 3240-3254.
[47]	TANG Y , ZHAO L , ZHANG S ,et al. Integrating prediction and reconstruction for anomaly detection[J]. Pattern Recognition Letters, 2020,129: 123-130.
[48]	RONNEBERGER O , FISCHER P , BROX T . U-Net:convolutional networks for biomedical image segmentation[C]// Proceedings of International Conference on Medical Image Computing and Computer-Assisted Intervention. Berlin:Springer, 2015: 234-241.
[49]	LIU C Y , XU X Y , ZHANG Y J . Temporal attention network for action proposal[C]// Proceedings of 2018 25th IEEE International Conference on Image Processing. Piscataway:IEEE Press, 2018: 2281-2285.
[50]	WANG X L , GIRSHICK R , GUPTA A ,et al. Non-local neural networks[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 7794-7803.
[51]	MATHIEU M , COUPRIE C , LECUN Y . Deep multi-scale video prediction beyond mean square error[J]. arXiv Preprint,arXiv:1511.05440, 2015.
[52]	LU C W , SHI J P , JIA J Y . Abnormal event detection at 150 FPS in MATLAB[C]// Proceedings of 2013 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2014: 2720-2727.
[53]	LUO W X , LIU W , GAO S H . A revisit of sparse coding based anomaly detection in stacked RNN framework[C]// Proceedings of 2017 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2017: 341-349.
[54]	GIORNO A D , BAGNELL J A , HEBERT M . A discriminative framework for anomaly detection in large videos[C]// European Conference on Computer Vision. Cham:Springer, 2016: 334-349.
[55]	LUO W X , LIU W , GAO S H . Remembering history with convolutional LSTM for anomaly detection[C]// Proceedings of 2017 IEEE International Conference on Multimedia and Expo. Piscataway:IEEE Press, 2017: 439-444.
[56]	SUN S , GONG X . Hierarchical semantic contrast for scene-aware video anomaly detection[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2023: 22846-22856.
[57]	IONESCU R T , SMEUREANU S , ALEXE B ,et al. Unmasking the abnormal events in video[C]// Proceedings of 2017 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2017: 2914-2922.
[58]	LIU W , LUO W X , LIAN D Z ,et al. Future frame prediction for anomaly detection - A new baseline[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 6536-6545.

Method	AUC (UCSD Ped2)	AUC (Avenue)	AUC (ShanghaiTech)
MPPCA [20]	69.3%	-	-
MPPCA+SFA [20]	61.3%	-	-
MDT [34]	82.9%	-	-
DFAD [54]	-	78.3%	-
Conv AE [30]	85.0%	80.0%	60.9%
ConvLSTM-AE [55]	88.1%	77.0%	-
AE-Conv3D [26]	91.2%	77.1%	-
Unmasking [57]	82.2%	80.6%	-
TSC [53]	91.0%	80.6%	67.9%
Stacked RNN [53]	92.2%	81.7%	68.0%
Frame-Pred [58]	95.4%	84.9%	72.8%
MemAE [15]	94.1%	83.3%	71.2%
AMC [17]	96.2%	86.9%	-
MNAD [16]	97.0%	88.5%	70.5%
IPR [47]	96.2%	83.7%	71.5%
USTN-DSC [41]	98.1%	89.9%	73.8%
HSC [56]	98.1%	92.4%	83.4%
Proposed method	97.4%	86.8%	73.2%
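The comparison above is reported in terms of AUC. As a rough illustration of what that number measures, the sketch below computes AUC as the probability that a randomly chosen abnormal frame receives a higher anomaly score than a randomly chosen normal frame. The frame-level evaluation protocol, the function name `frame_level_auc`, and the toy scores and labels are assumptions for illustration; the page itself does not state how scores are produced or normalized.

```python
def frame_level_auc(scores, labels):
    """AUC via pairwise ranking: P(score_abnormal > score_normal), ties count 0.5.

    scores: per-frame anomaly scores (higher means more anomalous);
    labels: 1 for abnormal frames, 0 for normal frames.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one abnormal frame")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical data): one abnormal frame scores below one normal frame.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(frame_level_auc(scores, labels))  # 0.75
```

The pairwise form is quadratic in the number of frames and is meant only to make the definition concrete; for real test sets a sort-based routine such as a library ROC implementation would be the usual choice.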

Component / dataset	Variant 1	Variant 2	Variant 3	Variant 4
U-Net	√	√	√	√
Global attention module	×	√	×	√
Local attention module	×	×	√	√
AUC on UCSD Ped2	95.2%	95.9%	96.8%	97.4%
AUC on Avenue	82.8%	83.3%	85.4%	86.8%
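To read the ablation results above more easily, the gains of each variant over the plain U-Net baseline can be worked out directly from the reported AUC values. The short sketch below does only that arithmetic; the variant labels and the helper name `gains_over_baseline` are illustrative and do not come from the article.

```python
# AUC values (in %) copied from the ablation table above.
ablation = {
    "UCSD Ped2": [95.2, 95.9, 96.8, 97.4],
    "Avenue": [82.8, 83.3, 85.4, 86.8],
}
# Hypothetical labels for the four configurations.
variants = ["U-Net", "+global", "+local", "+global+local"]

def gains_over_baseline(aucs):
    """Percentage-point gain of each variant over the first (baseline) entry."""
    return [round(a - aucs[0], 1) for a in aucs]

for dataset, aucs in ablation.items():
    for name, gain in zip(variants, gains_over_baseline(aucs)):
        print(f"{dataset:10s} {name:15s} +{gain:.1f} points")
```

On these numbers the full model gains 2.2 points on UCSD Ped2 and 4.0 points on Avenue, and the local attention module alone (1.6 and 2.6 points) contributes more than the global attention module alone (0.7 and 0.5 points).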

Structural component	AUC
4-layer U-Net	97.1%
5-layer U-Net	97.4% (↑0.3%)
Single encoder	96.6%
Dual encoder	97.4% (↑0.8%)
