Chinese Journal of Network and Information Security ›› 2022, Vol. 8 ›› Issue (1): 86-94. doi: 10.11959/j.issn.2096-109x.2021095

Adversarial example defense algorithm for MNIST based on image reconstruction

Zhongyuan QIN1, Zhaoxiang HE1, Tao LI1,2, Liquan CHEN1,2

    1 School of Cyber Science and Engineering, Southeast University, Nanjing 211189, China
    2 Network Communication and Security Purple Mountain Laboratory, Nanjing 211189, China
  • Revised: 2021-09-14  Online: 2022-02-15  Published: 2022-02-01
  • About the authors: Zhongyuan QIN (1974− ), male, born in Anyang, Henan, Ph.D., is an associate professor at Southeast University. His main research interests include artificial intelligence security and wireless network security.
    Zhaoxiang HE (1995− ), male, born in Linyi, Shandong, is a master's student at Southeast University. His main research interests include artificial intelligence security and wireless network security.
    Tao LI (1984− ), male, born in Zhenjiang, Jiangsu, Ph.D., is an associate professor at Southeast University. His main research interests include trusted computing, mobile terminal security, and endogenous security.
    Liquan CHEN (1976− ), male, born in Yulin, Guangxi, Ph.D., is a professor and doctoral supervisor at Southeast University. His main research interests include mobile information security, IoT systems and security, and cloud computing and big data security.
  • Supported by:
    The National Key R&D Program of China (2020YFE0200600); The National Natural Science Foundation of China (61601113)

Abstract:

With the popularization of deep learning, its security issues have drawn increasing attention. An adversarial example is created by adding a small perturbation to an original image so that a deep learning model misclassifies the image, which seriously hinders the development of deep learning technology. To address this problem, the attack forms and harm of existing adversarial examples were analyzed, and, given the shortcomings of existing defense algorithms, a defense method based on image reconstruction was proposed to defend against adversarial examples effectively. The method was evaluated on the MNIST dataset. Its core idea is image reconstruction, comprising central variance minimization and image quilting optimization: central variance minimization processes only the central region of the image, while image quilting optimization incorporates the overlapping region into patch selection and uses half the patch size as the overlap area. Adversarial examples generated with the FGSM, BIM, DeepFool, and C&W attacks were used to test the defense performance of the two methods, and the results were compared with three existing image reconstruction defenses (cropping and scaling, bit-depth compression, and JPEG compression). The experimental results show that the proposed central variance minimization and image quilting optimization algorithms defend well against common adversarial example attacks: image quilting optimization achieves over 75% classification accuracy on examples generated by all four attack algorithms, and central variance minimization achieves around 70%. In contrast, the three image reconstruction algorithms used for comparison are unstable across the different attack algorithms, with overall classification accuracy below 60%. The two proposed image reconstruction defenses thus effectively defend against adversarial examples; the experiments demonstrate their defense effect under different adversarial example attack algorithms, and the comparison with other image reconstruction algorithms shows that the proposed scheme has good defense performance.
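The abstract only outlines the two defenses, so the following is a minimal illustrative sketch rather than the authors' implementation: total-variation denoising from scikit-image stands in for the variance-minimizing reconstruction, and the window size, patch size, and denoising weight are assumed values. The second function shows the half-patch-size overlap criterion described above for image quilting patch selection.

import numpy as np
from skimage.restoration import denoise_tv_chambolle

def central_variance_minimization(img: np.ndarray, center: int = 20,
                                  weight: float = 0.1) -> np.ndarray:
    """Reconstruct only the central `center` x `center` window of a
    28 x 28 MNIST image with pixel values in [0, 1]."""
    h, w = img.shape
    top, left = (h - center) // 2, (w - center) // 2
    out = img.copy()
    # TV denoising suppresses the small adversarial perturbation while
    # preserving the digit's edges; it is used here as a stand-in for
    # the paper's variance-minimizing reconstruction.
    out[top:top + center, left:left + center] = denoise_tv_chambolle(
        img[top:top + center, left:left + center], weight=weight)
    return out

def choose_quilt_patch(candidates: np.ndarray,
                       left_neighbor: np.ndarray) -> np.ndarray:
    """Pick, from `candidates` (an N x patch x patch stack of clean
    patches), the patch whose left strip best matches the right strip
    of the patch already placed to its left. The overlap width is half
    the patch size, as described in the abstract."""
    patch = candidates.shape[-1]
    ov = patch // 2
    target = left_neighbor[:, -ov:]          # right strip of placed patch
    errs = ((candidates[:, :, :ov] - target) ** 2).sum(axis=(1, 2))
    return candidates[int(np.argmin(errs))]

A full quilting pass would tile the image left to right and top to bottom, applying the same overlap test against the upper neighbor as well; the single-neighbor version above is kept short for illustration.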

Key words: adversarial example, image reconstruction, deep learning, image classification
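For reference, FGSM, the simplest of the four attacks used in the evaluation, perturbs an image by one signed-gradient step, x_adv = x + ε·sign(∇x J(x, y)), and BIM iterates this step. Below is a minimal PyTorch sketch; the model and ε are assumptions, not the paper's experimental setup.

import torch
import torch.nn.functional as F

def fgsm(model: torch.nn.Module, x: torch.Tensor, y: torch.Tensor,
         eps: float = 0.1) -> torch.Tensor:
    """Generate FGSM adversarial examples for a batch of MNIST images
    x in [0, 1] with true labels y."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # One signed-gradient ascent step on the loss, then clip back to
    # the valid pixel range.
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()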
