Novel video anomaly detection method based on global-local self-attention network

doi:10.11959/j.issn.1000-436x.2023151

Abstract

To improve the accuracy of video anomaly detection, a novel video anomaly detection method based on a global-local self-attention network was proposed. Firstly, the video sequence and the corresponding RGB sequence were fused to highlight the motion changes of objects. Secondly, the temporal correlation of the video sequence within local regions was captured by a dilated convolution layer, while a self-attention network was used to compute the global temporal dependencies of the video sequence. Meanwhile, by deepening the basic U-Net network and combining the relevant motion and representation constraints, the network model was trained end-to-end to improve the detection accuracy and robustness of the model. Finally, experiments were carried out on the public datasets UCSD Ped2, CUHK Avenue and ShanghaiTech, and the test results were visually analyzed. The experimental results show that the detection accuracy (AUC) of the proposed method reaches 97.4%, 86.8% and 73.2% on the three datasets, respectively, which is higher than that of most of the compared methods.
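
The global-local idea in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the fixed smoothing kernel, the absence of learned projections, and the fusion of the two branches by element-wise addition are all assumptions made for illustration. The local branch is a dilated (expansion) temporal convolution over per-frame feature vectors, and the global branch is scaled dot-product self-attention over all time steps.

```python
import math


def temporal_dilated_conv(feats, kernel, dilation):
    """Local branch (illustrative): per-channel 1-D convolution along time.

    feats is a list of T feature vectors of length D. Positions outside
    the sequence are treated as zeros, so the output keeps length T.
    """
    T, D = len(feats), len(feats[0])
    half = (len(kernel) - 1) // 2
    out = [[0.0] * D for _ in range(T)]
    for t in range(T):
        for k, w in enumerate(kernel):
            s = t + (k - half) * dilation  # neighbour index, spaced by dilation
            if 0 <= s < T:
                for d in range(D):
                    out[t][d] += w * feats[s][d]
    return out


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(feats):
    """Global branch (illustrative): scaled dot-product self-attention.

    Queries, keys and values are the input features themselves; a trained
    model would use learned projections instead (assumption).
    """
    D = len(feats[0])
    scale = math.sqrt(D)
    out = []
    for q in feats:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in feats]
        weights = softmax(scores)  # one weight per time step
        out.append([sum(w * v[d] for w, v in zip(weights, feats))
                    for d in range(D)])
    return out


def global_local_block(feats, kernel=(0.25, 0.5, 0.25), dilation=2):
    """Fuse both branches by element-wise addition (an assumed fusion rule)."""
    local = temporal_dilated_conv(feats, kernel, dilation)
    glob = self_attention(feats)
    return [[l + g for l, g in zip(lrow, grow)]
            for lrow, grow in zip(local, glob)]


# Toy input: 4 frames, each described by a 2-dimensional feature vector.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
fused = global_local_block(frames)
print(len(fused), len(fused[0]))
```

With a dilation of 2, each output step of the local branch only sees frames two steps apart, whereas every output step of the global branch is a weighted average over the whole sequence.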

Key words: video anomaly detection, self-attention, prediction, reconstruction

CLC Number: TP391.41

Jing YANG, Chengmao WU, Liuping ZHOU. Novel video anomaly detection method based on global-local self-attention network[J]. Journal on Communications, 2023, 44(8): 241-250.

References

[1]	RAMACHANDRA B , JONES M J , VATSAVAI R R . A survey of single-scene video anomaly detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022,44(5): 2293-2312.
[2]	SINGH A , JONES M J , LEARNED-MILLER E G . EVAL:explainable video anomaly localization[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2023: 18717-18726.
[3]	SALIGRAMA V , KONRAD J , JODOIN P M . Video anomaly identification[J]. IEEE Signal Processing Magazine, 2010,27(5): 18-33.
[4]	LUO W X , LIU W , LIAN D Z ,et al. Video anomaly detection with sparse coding inspired deep neural networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021,43(3): 1070-1084.
[5]	ŞENGÖNÜL E , SAMET R , ABU A Q ,et al. An analysis of artificial intelligence techniques in surveillance video anomaly detection:a comprehensive survey[J]. Applied Sciences, 2023,13(8): 4956.
[6]	HU T , LONG C , XIAO C . CRD-CGAN:category-consistent and relativistic constraints for diverse text-to-image generation[J]. arXiv Preprint,arXiv:2107.13516, 2021.
[7]	THAKARE K V , RAGHUWANSHI Y , DOGRA D P ,et al. DyAnNet:a scene dynamicity guided self-trained video anomaly detection network[C]// Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway:IEEE Press, 2023: 5541-5550.
[8]	HU T , LONG C J , XIAO C X . A novel visual representation on text using diverse conditional GAN for visual recognition[J]. IEEE Transactions on Image Processing, 2021,30: 3499-3512.
[9]	ISLAM A , LONG C J , BASHARAT A ,et al. DOA-GAN:dual-order attentive generative adversarial network for image copy-move forgery detection and localization[C]// Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway:IEEE Press, 2020: 4675-4684.
[10]	GU T P , CHEN G Y , LI J L ,et al. Stochastic trajectory prediction via motion indeterminacy diffusion[C]// Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2022: 17092-17101.
[11]	ISLAM A , LONG C J , RADKE R . A hybrid attention mechanism for weakly-supervised temporal action localization[C]// Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto:AAAI Press, 2021: 1637-1645.
[12]	LI Z , WANG Y C , ZHANG N ,et al. Deep learning-based object detection techniques for remote sensing images:a survey[J]. Remote Sensing, 2022,14(10): 2385.
[13]	ZHAO Z J , WEI S T , CHEN Q C ,et al. Masked retraining teacher-student framework for domain adaptive object detection[C]// Proceedings of the IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2023: 1-12.
[14]	LU Y W , KUMAR K M , NABAVI S S ,et al. Future frame prediction using convolutional VRNN for anomaly detection[C]// Proceedings of the 16th IEEE International Conference on Advanced Video and Signal Based Surveillance. Piscataway:IEEE Press, 2019: 1-8.
[15]	GONG D , LIU L Q , LE V ,et al. Memorizing normality to detect anomaly:memory-augmented deep autoencoder for unsupervised anomaly detection[C]// Proceedings of IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2020: 1705-1714.
[16]	PARK H , NOH J , HAM B . Learning memory-guided normality for anomaly detection[C]// Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 14360-14369.
[17]	NGUYEN T N , MEUNIER J . Anomaly detection in video sequence with appearance-motion correspondence[C]// Proceedings of IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2020: 1273-1283.
[18]	LIU Z , WU X M , ZHENG D ,et al. Generating anomalies for video anomaly detection with prompt-based feature mapping[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2023: 24500-24510.
[19]	WANG X , ZHANG S , CEN J ,et al. CLIP-guided prototype modulating for few-shot action recognition[J]. arXiv Preprint,arXiv:2303.02982, 2023.
[20]	KIM J , GRAUMAN K . Observe locally,infer globally:a space-time MRF for detecting abnormal activities with incremental updates[C]// Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2009: 2921-2928.
[21]	SAUNSHI N . Towards understanding self-supervised representation learning[D]. Princeton:Princeton University, 2022.
[22]	WANG Y Z , QIN C , BAI Y ,et al. Making reconstruction-based method great again for video anomaly detection[C]// Proceedings of IEEE International Conference on Data Mining (ICDM). Piscataway:IEEE Press, 2023: 1215-1220.
[23]	KRIZHEVSKY A , SUTSKEVER I , HINTON G E . ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017,60(6): 84-90.
[24]	MEDEL J R , SAVAKIS A . Anomaly detection in video using predictive convolutional long short-term memory networks[J]. arXiv Preprint,arXiv:1612.00390, 2016.
[25]	HASAN M , CHOI J , NEUMANN J ,et al. Learning temporal regularity in video sequences[C]// Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2016: 733-742.
[26]	ZHAO Y R , DENG B , SHEN C ,et al. Spatio-temporal autoencoder for video anomaly detection[C]// Proceedings of the 25th ACM International Conference on Multimedia. New York:ACM Press, 2017: 1933-1941.
[27]	LIU W , LI R , ZHENG M ,et al. Towards visually explaining variational autoencoders[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2020: 8642-8651.
[28]	SELVARAJU R R , COGSWELL M , DAS A ,et al. Grad-CAM:visual explanations from deep networks via gradient-based localization[C]// Proceedings of 2017 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2017: 618-626.
[29]	VENKATARAMANAN S , PENG K C , SINGH R V ,et al. Attention guided anomaly localization in images[C]// Proceedings of European Conference on Computer Vision. Berlin:Springer, 2020: 485-503.
[30]	KIMURA D , CHAUDHURY S , NARITA M ,et al. Adversarial discriminative attention for robust anomaly detection[C]// Proceedings of 2020 IEEE Winter Conference on Applications of Computer Vision. Piscataway:IEEE Press, 2020: 2161-2170.
[31]	KINGMA D P , WELLING M . Auto-encoding variational Bayes[J]. arXiv Preprint,arXiv:1312.6114, 2013.
[32]	ZHAO B , LI F F , XING E P . Online detection of unusual events in videos via dynamic sparse coding[C]// Proceedings of Computer Vision & Pattern Recognition. Piscataway:IEEE Press, 2011: 3313-3320.
[33]	VASWANI N , ROY-CHOWDHURY A K , CHELLAPPA R . “Shape Activity”:a continuous-state HMM for moving/deforming shapes with application to abnormal activity detection[J]. IEEE Transactions on Image Processing, 2005,14(10): 1603-1616.
[34]	MAHADEVAN V , LI W X , BHALODIA V ,et al. Anomaly detection in crowded scenes[C]// Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2010: 1975-1981.
[35]	CHENG K W , CHEN Y T , FANG W H . Video anomaly detection and localization using hierarchical feature representation and Gaussian process regression[C]// Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway:IEEE Press, 2015: 2909-2917.
[36]	RUFF L , VANDERMEULEN R A , GÖRNITZ N ,et al. Deep one-class classification[C]// Proceedings of International Conference on Machine Learning. New York:PMLR, 2018: 4393-4402.
[37]	SCHÖLKOPF B , PLATT J C , SHAWE-TAYLOR J ,et al. Estimating the support of a high-dimensional distribution[J]. Neural Computation, 2001,13(7): 1443-1471.
[38]	PURWANTO D , PRAMONO R R A , CHEN Y T ,et al. Corrections to“three-stream network with bidirectional self-attention for action recognition in extreme low resolution videos”[J]. IEEE Signal Processing Letters, 2020,27:2188.
[39]	ZHOU J T , ZHANG L , FANG Z W ,et al. Attention-driven loss for anomaly detection in video surveillance[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020,30(12): 4639-4647.
[40]	HU C , WU F , WU W ,et al. Normal learning in videos with attention prototype network[J]. arXiv Preprint,arXiv:2108.11055, 2021.
[41]	YANG Z , LIU J , WU Z ,et al. Video event restoration based on keyframes for video anomaly detection[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2023: 14592-14601.
[42]	LUO W X , LIU W , LIAN D Z ,et al. Future frame prediction network for video anomaly detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022,44(11): 7505-7520.
[43]	YU G , WANG S Q , CAI Z P ,et al. Cloze test helps:effective video anomaly detection via learning to complete video events[C]// Proceedings of the 28th ACM International Conference on Multimedia. New York:ACM Press, 2020: 583-591.
[44]	LIU Z A , NIE Y W , LONG C J ,et al. A hybrid video anomaly detection framework via memory-augmented flow reconstruction and flow-guided frame prediction[C]// Proceedings of IEEE/CVF International Conference on Computer Vision. Piscataway:IEEE Press, 2022: 13568-13577.
[45]	CHANG Y , TU Z , XIE W ,et al. Video anomaly detection with spatio-temporal dissociation[J]. Pattern Recognition, 2022,122:108213.
[46]	LE V T , KIM Y G . Attention-based residual autoencoder for video anomaly detection[J]. Applied Intelligence, 2023,53(3): 3240-3254.
[47]	TANG Y , ZHAO L , ZHANG S ,et al. Integrating prediction and reconstruction for anomaly detection[J]. Pattern Recognition Letters, 2020,129: 123-130.
[48]	RONNEBERGER O , FISCHER P , BROX T . U-Net:convolutional networks for biomedical image segmentation[C]// Proceedings of International Conference on Medical Image Computing and Computer-Assisted Intervention. Berlin:Springer, 2015: 234-241.
[49]	LIU C Y , XU X Y , ZHANG Y J . Temporal attention network for action proposal[C]// Proceedings of 2018 25th IEEE International Conference on Image Processing. Piscataway:IEEE Press, 2018: 2281-2285.
[50]	WANG X L , GIRSHICK R , GUPTA A ,et al. Non-local neural networks[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 7794-7803.
[51]	MATHIEU M , COUPRIE C , LECUN Y . Deep multi-scale video prediction beyond mean square error[J]. arXiv Preprint,arXiv:1511.05440, 2015.
[52]	LU C W , SHI J P , JIA J Y . Abnormal event detection at 150 FPS in MATLAB[C]// Proceedings of 2013 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2014: 2720-2727.
[53]	LUO W X , LIU W , GAO S H . A revisit of sparse coding based anomaly detection in stacked RNN framework[C]// Proceedings of 2017 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2017: 341-349.
[54]	GIORNO A D , BAGNELL J A , HEBERT M . A discriminative framework for anomaly detection in large videos[C]// European Conference on Computer Vision. Cham:Springer, 2016: 334-349.
[55]	LUO W X , LIU W , GAO S H . Remembering history with convolutional LSTM for anomaly detection[C]// Proceedings of 2017 IEEE International Conference on Multimedia and Expo. Piscataway:IEEE Press, 2017: 439-444.
[56]	SUN S , GONG X . Hierarchical semantic contrast for scene-aware video anomaly detection[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2023: 22846-22856.
[57]	IONESCU R T , SMEUREANU S , ALEXE B ,et al. Unmasking the abnormal events in video[C]// Proceedings of 2017 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2017: 2914-2922.
[58]	LIU W , LUO W X , LIAN D Z ,et al. Future frame prediction for anomaly detection - A new baseline[C]// Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018: 6536-6545.

Method	UCSD Ped2 AUC	Avenue AUC	ShanghaiTech AUC
MPPCA [20]	69.3%	-	-
MPPCA+SFA [20]	61.3%	-	-
MDT [34]	82.9%	-	-
DFAD [54]	-	78.3%	-
Conv AE [30]	85.0%	80.0%	60.9%
ConvLSTM-AE [55]	88.1%	77.0%	-
AE-Conv3D [26]	91.2%	77.1%	-
Unmasking [57]	82.2%	80.6%	-
TSC [53]	91.0%	80.6%	67.9%
Stacked RNN [53]	92.2%	81.7%	68%
Frame-Pred [58]	95.4%	84.9%	72.8%
MemAE [15]	94.1%	83.3%	71.2%
AMC [17]	96.2%	86.9%	-
MNAD [16]	97%	88.5%	70.5%
IPR [47]	96.2%	83.7%	71.5%
USTN-DSC [41]	98.1%	89.9%	73.8%
HSC [56]	98.1%	92.4%	83.4%
Proposed method	97.4%	86.8%	73.2%
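
The AUC values in this table summarize how well anomaly scores separate abnormal from normal samples. As a hedged illustration only (the page does not describe the evaluation protocol, so frame-level scoring, the function name `frame_auc` and the toy data are assumptions), the area under the ROC curve can be computed as the fraction of (abnormal, normal) pairs in which the abnormal sample receives the higher score, with ties counted as half:

```python
def frame_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: anomaly score per frame, higher means more anomalous.
    labels: ground truth per frame, 1 for abnormal and 0 for normal.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one abnormal and one normal frame")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # abnormal frame ranked above normal frame
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 abnormal/normal pairs are ordered correctly.
print(frame_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

The pairwise form is quadratic in the number of frames, which is fine for a small illustration; a sort-based or library implementation would be preferable for full test sets.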

Module components and AUC
U-Net	√	√	√	√
Global attention module	×	√	×	√
Local attention module	×	×	√	√
AUC on UCSD Ped2	95.2%	95.9%	96.8%	97.4%
AUC on Avenue	82.8%	83.3%	85.4%	86.8%

Structural component	AUC
4-layer U-Net	97.1%
5-layer U-Net	97.4% (↑0.3%)
Single encoder	96.6%
Dual encoder	97.4% (↑0.8%)
