Generalized Grad-CAM attacking method based on adversarial patch

doi:10.11959/j.issn.1000-436x.2021025

Abstract

To verify the fragility of Grad-CAM, a Grad-CAM attack method based on an adversarial patch is proposed. A constraint on the Grad-CAM map is added to the classification loss function, an adversarial patch is optimized under this loss, and the patch is composited into the input to synthesize the adversarial image. The adversarial image steers the Grad-CAM interpretation result towards the patch area while the classification result remains unchanged, so the attack targets the interpretation rather than the prediction. In addition, batch training on the dataset and an added constraint on the perturbation norm improve the generalization of the adversarial patch and its usability across scenes. Experimental results on the ILSVRC2012 dataset show that, compared with existing methods, the proposed method attacks the Grad-CAM interpretation results more simply and effectively while maintaining classification accuracy.
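The abstract does not give the loss in closed form, so the following is only a minimal sketch of the ingredients it names. `grad_cam` follows the standard Grad-CAM definition (channel weights are spatially averaged gradients, followed by a ReLU). `apply_patch`, `attack_objective`, the weight `lam`, and the particular penalty (share of heatmap mass outside the patch) are hypothetical choices made for illustration and are not taken from the paper.

```python
from math import fsum


def apply_patch(image, patch, top, left):
    """Return a copy of `image` (H x W list of lists) with `patch` pasted at (top, left)."""
    out = [row[:] for row in image]
    for i, patch_row in enumerate(patch):
        for j, value in enumerate(patch_row):
            out[top + i][left + j] = value
    return out


def grad_cam(activations, gradients):
    """Standard Grad-CAM map from K feature maps and their gradients (each H x W)."""
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight = global average of that channel's gradient map.
    weights = [fsum(fsum(row) for row in g) / (height * width) for g in gradients]
    # ReLU of the weighted sum over channels, evaluated per spatial position.
    return [
        [max(0.0, fsum(w * a[i][j] for w, a in zip(weights, activations)))
         for j in range(width)]
        for i in range(height)
    ]


def attack_objective(ce_loss, cam, patch_mask, lam=1.0):
    """Assumed combined loss: classification loss + lam * (heatmap share outside patch)."""
    total = fsum(fsum(row) for row in cam)
    if total == 0.0:
        # No heatmap mass anywhere: apply the full penalty (a choice made in this sketch).
        return ce_loss + lam
    outside = fsum(
        v * (1 - m)
        for row, mask_row in zip(cam, patch_mask)
        for v, m in zip(row, mask_row)
    )
    return ce_loss + lam * outside / total
```

Under these assumptions, minimizing `attack_objective` over the patch pixels keeps the classification term low while pushing the heatmap into the patch region. In practice the activations and gradients would come from an automatic differentiation framework rather than hand-written lists.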

Key words: convolutional neural network, interpretability, adversarial patch, class activation map, saliency map

CLC Number: TP391

Nianwen SI, Wenlin ZHANG, Dan QU, Heyu CHANG, Shengxiang LI, Tong NIU. Generalized Grad-CAM attacking method based on adversarial patch[J]. Journal on Communications, 2021, 42(3): 23-35.

Method	Top-1 accuracy	ER_p	ER_b
Original image	92.70%	4.85%	67.76%
Adversarial fine-tuning method	87.90%	71.13% ↑	35.42% ↓
Adversarial patch method (proposed method)	92.50%	67.19% ↑	38.52% ↓
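ER_p and ER_b are not defined on this page. Assuming they denote the percentage of Grad-CAM heatmap energy that falls inside the patch region and inside the object bounding box respectively, the metric could be computed roughly as follows. The function name `energy_ratio`, the `(top, left, height, width)` box format, and the toy heatmap are illustrative assumptions.

```python
def energy_ratio(cam, box):
    """Percentage of heatmap mass inside `box` = (top, left, height, width)."""
    top, left, height, width = box
    total = sum(sum(row) for row in cam)
    if total == 0:
        return 0.0  # empty heatmap: report zero rather than divide by zero
    inside = sum(sum(row[left:left + width]) for row in cam[top:top + height])
    return 100.0 * inside / total


# Toy 4 x 4 heatmap: a 2 x 2 object in the middle, a strong response in one corner.
cam = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [6.0, 0.0, 0.0, 0.0],
]
er_p = energy_ratio(cam, (3, 0, 1, 1))  # hypothetical patch region: bottom-left cell
er_b = energy_ratio(cam, (1, 1, 2, 2))  # hypothetical object bounding box
print(f"ER_p = {er_p:.2f}%, ER_b = {er_b:.2f}%")
```

Under this reading, a successful attack raises ER_p and lowers ER_b, which matches the direction of the arrows in the tables.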

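Model	Method	Top-1 accuracy	ER_p	ER_b
VGGNet-16	Original image	90.60%	5.67%	62.05%
VGGNet-16	Adversarial image (proposed method)	89.30%	64.94% ↑	40.19% ↓
VGGNet-19-BN	Original image	92.70%	4.85%	67.76%
VGGNet-19-BN	Adversarial image (proposed method)	92.50%	67.19% ↑	38.52% ↓
ResNet-50	Original image	94.50%	2.97%	63.99%
ResNet-50	Adversarial image (proposed method)	94.40%	33.73% ↑	47.32% ↓
DenseNet-161	Original image	96.60%	4.00%	65.37%
DenseNet-161	Adversarial image (proposed method)	96.20%	38.51% ↑	45.41% ↓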

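Class	ER_p (single-image adversarial patch)	ER_b (single-image adversarial patch)	ER_p (generalizable universal adversarial patch)	ER_b (generalizable universal adversarial patch)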
airliner	67.32%	32.24%	68.56% ↑	30.26% ↓
sports_car	63.56%	35.61%	65.32% ↑	34.12% ↓
indigo_bunting	68.39%	14.99%	70.45% ↑	13.79% ↓
tabby	69.51%	37.26%	71.23% ↑	36.56% ↓
hartebeest	50.07%	13.28%	51.89% ↑	12.76% ↓
golden_retriever	62.17%	26.51%	63.78% ↑	25.78% ↓
bullfrog	59.34%	19.30%	60.45% ↑	18.32% ↓
sorrel	65.86%	32.19%	67.49% ↑	31.37% ↓
speedboat	63.39%	26.27%	65.02% ↑	24.53% ↓
pickup	67.85%	37.16%	69.30% ↑	36.52% ↓

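Model	Method	Top-1 accuracy (top-left)	ER_p (top-left)	ER_b (top-left)	Top-1 accuracy (bottom-right)	ER_p (bottom-right)	ER_b (bottom-right)	Top-1 accuracy (border)	ER_p (border)	ER_b (border)
VGGNet-16	Original image	90.60%	5.67%	62.05%	90.60%	6.23%	62.05%	90.60%	9.38%	62.05%
VGGNet-16	Adversarial image (proposed method)	90.60%	95.75% ↑	29.26% ↓	90.40%	95.76% ↑	35.68% ↓	90.60%	96.32% ↑	40.21% ↓
VGGNet-19-BN	Original image	92.70%	4.85%	67.76%	92.70%	5.39%	67.76%	92.70%	9.24%	67.76%
VGGNet-19-BN	Adversarial image (proposed method)	92.70%	97.08% ↑	26.62% ↓	92.70%	96.74% ↑	33.44% ↓	92.70%	96.13% ↑	39.25% ↓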
