Adversarial patch defense algorithm based on PatchTracker

Chinese Journal of Network and Information Security, 2024, Vol. 10, Issue (1): 169-180. doi: 10.11959/j.issn.2096-109x.2024009

• Academic paper •

Zhenjie XIAO¹, Shiyu HUANG¹, Feng YE¹,², Liqing HUANG¹,², Tianqiang HUANG¹,²

¹ College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350117, China
² Digital Fujian Institute of Big Data Security Technology, Fuzhou 350117, China

Revised: 2023-12-16 Online: 2024-02-01 Published: 2024-02-01
About the authors:
Zhenjie XIAO (肖镇杰, 2000− ), male, born in Quanzhou, Fujian, is a master's student at Fujian Normal University. His main research interest is computer vision.
Shiyu HUANG (黄诗瑀, 1999− ), male, born in Xiamen, Fujian, is a master's student at Fujian Normal University. His main research interests are adversarial deep learning and digital media forensics.
Feng YE (叶锋, 1978− ), male, born in Fuzhou, Fujian, Ph.D., is an associate professor at Fujian Normal University. His main research interests are computer vision and video and image coding.
Liqing HUANG (黄丽清, 1991− ), female, born in Putian, Fujian, Ph.D., is a lecturer at Fujian Normal University. Her main research interests are video and image super-resolution, deblurring, and digital media forensics.
Tianqiang HUANG (黄添强, 1971− ), male, born in Putian, Fujian, Ph.D., is a professor and doctoral supervisor at Fujian Normal University. His main research interests are machine learning security and digital media forensics.
Supported by:
The National Natural Science Foundation of China (62072106); Fujian Innovation Strategy Research Program Project (2023R0156)

Abstract

Abstract:

Object detection based on deep neural networks is widely used across many fields. However, adversarial patch attacks, which add local perturbations to an image to mislead the network, pose a serious threat to vision systems built on object detection. To address this problem, an adversarial patch defense algorithm based on PatchTracker was proposed that exploits the semantic difference between adversarial patches and the image background. The algorithm consists of an upstream patch detector and a downstream data augmentation module. The upstream patch detector uses a YOLOV5 (you only look once-v5) model with an attention mechanism to locate adversarial patches, which improves detection accuracy for small-scale patches. The detected regions are then covered with appropriate pixel values to erase the patches. The upstream detector effectively weakens adversarial examples and does not rely on large-scale training data. The downstream data augmentation module improves the robustness of the downstream object detector by modifying the training paradigm of the model. Finally, the image with the patches erased is fed to the downstream YOLOV5 object detection model trained with data augmentation. Cross-validation was performed on the public TT100K traffic sign dataset. Experimental results showed that, compared with no defense, the proposed algorithm effectively defends against several types of universal adversarial patch attacks: the mean average precision (mAP) on adversarial patch images rises by approximately 65 percentage points, and missed detections of small-scale adversarial patches are substantially reduced. Compared with existing algorithms, the proposed algorithm effectively improves the accuracy of neural networks on adversarial examples. In addition, it requires no modification of the downstream model structure and therefore offers good compatibility.

Key words: deep learning security, adversarial attack and defense, adversarial patch, object detection
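
The two-stage flow described in the abstract (locate patches, cover them, then run the downstream detector) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `erase_patches`, `defend`, the box format `(x1, y1, x2, y2)`, and the mid-gray default fill are all hypothetical, since the abstract only says the regions are covered with "appropriate pixel values". The two detectors are passed in as plain callables standing in for the upstream and downstream YOLOV5 models.

```python
import numpy as np


def erase_patches(image, boxes, fill_value=127):
    """Return a copy of `image` with every box covered by `fill_value`.

    image: H x W x C uint8 array.
    boxes: iterable of (x1, y1, x2, y2) pixel boxes (assumed format).
    fill_value: constant used for covering; mid-gray is an assumption.
    """
    out = image.copy()
    h, w = out.shape[:2]
    for x1, y1, x2, y2 in boxes:
        # Clip each box to the image so out-of-range detections are harmless.
        x1, x2 = max(0, int(x1)), min(w, int(x2))
        y1, y2 = max(0, int(y1)), min(h, int(y2))
        if x2 > x1 and y2 > y1:
            out[y1:y2, x1:x2] = fill_value
    return out


def defend(image, patch_detector, object_detector):
    """Upstream: find and erase patches. Downstream: detect objects."""
    boxes = patch_detector(image)          # hypothetical upstream model
    cleaned = erase_patches(image, boxes)  # cover the detected regions
    return object_detector(cleaned)        # hypothetical downstream model
```

Because the erasing step only edits pixels, the downstream detector is used unchanged, which matches the compatibility claim in the abstract.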

CLC number:

TP393

Zhenjie XIAO, Shiyu HUANG, Feng YE, Liqing HUANG, Tianqiang HUANG. Adversarial patch defense algorithm based on PatchTracker[J]. Chinese Journal of Network and Information Security, 2024, 10(1): 169-180.

Figures/Tables 14

Table 5 Comparison of detection results of different defense methods on TT100K (AP)

Image class	Adam						MIM
Image class	No defense	LGS	JPEG compression	Jedi	SAC	PatchTracker	No defense	LGS	JPEG compression	Jedi	SAC	PatchTracker
p26	33.04%	27.32%	34.01%	24.63%	54.41%	85.30%	29.00%	26.50%	26.50%	23.34%	51.82%	92.42%
pl40	4.79%	6.91%	7.98%	42.22%	60.89%	82.41%	11.17%	13.83%	13.30%	44.64%	57.61%	84.88%
p11	18.31%	18.87%	19.32%	37.82%	52.36%	74.67%	24.14%	21.24%	21.40%	37.41%	52.74%	78.72%
pne	25.35%	22.07%	22.54%	62.73%	74.11%	88.81%	12.42%	15.29%	14.82%	62.02%	73.97%	87.49%
i5	11.95%	15.36%	15.84%	56.38%	74.37%	88.22%	23.06%	23.06%	23.06%	56.39%	73.58%	88.26%
i4	12.75%	14.20%	13.74%	29.61%	57.17%	88.93%	20.38%	14.13%	23.70%	31.65%	58.16%	88.94%
mAP	17.70%	17.46%	18.91%	42.23%	62.22%	84.72%	20.03%	19.01%	20.46%	42.58%	61.31%	86.79%
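
The mAP row in Table 5 is the arithmetic mean of the per-class AP values in each column. The short check below recomputes it for the "No defense" and "PatchTracker" columns under the Adam-generated patch, using only the numbers printed in the table; the variable names are illustrative.

```python
from statistics import mean

# Per-class AP (%) on TT100K under the Adam-generated patch, copied from Table 5.
ap_no_defense = {"p26": 33.04, "pl40": 4.79, "p11": 18.31,
                 "pne": 25.35, "i5": 11.95, "i4": 12.75}
ap_patchtracker = {"p26": 85.30, "pl40": 82.41, "p11": 74.67,
                   "pne": 88.81, "i5": 88.22, "i4": 88.93}

# mAP is the mean of the per-class AP values, rounded as in the table.
map_no_defense = round(mean(ap_no_defense.values()), 2)
map_patchtracker = round(mean(ap_patchtracker.values()), 2)

# Absolute gain in percentage points between the two reported mAP values.
gain = round(map_patchtracker - map_no_defense, 2)

print(map_no_defense, map_patchtracker, gain)  # 17.7 84.72 67.02
```

The recomputed values match the table's mAP row (17.70% and 84.72%), a gain of about 67 percentage points for this attack setting.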

Table 6 Comparison of detection results of different defense methods on TS4500 (AP)

Image class	Adam						PGD
Image class	No defense	LGS	JPEG	Jedi	SAC	PatchTracker	No defense	LGS	JPEG	Jedi	SAC	PatchTracker
Stop	40.55%	49.19%	43.59%	70.42%	76.81%	89.94%	48.72%	48.71%	48.48%	63.56%	77.91%	90.93%
School	0.00%	2.84%	0.00%	9.71%	20.54%	56.40%	0.18%	0.90%	0.08%	5.15%	21.18%	55.67%
Crosswalk	9.90%	23.05%	10.34%	42.08%	70.59%	82.28%	13.74%	18.76%	11.25%	22.38%	74.51%	82.20%
Crossroad	6.12%	10.22%	6.55%	27.13%	45.53%	76.25%	6.74%	7.15%	5.97%	26.23%	36.68%	86.02%
NoEntry	13.00%	52.00%	15.00%	55.50%	89.99%	89.50%	25.50%	45.00%	28.50%	45.92%	92.99%	91.00%
NoPark	19.52%	57.14%	20.77%	41.88%	41.16%	79.75%	21.31%	64.24%	24.03%	39.71%	39.49%	76.62%
Park	12.69%	13.29%	11.44%	32.84%	73.63%	94.37%	12.73%	10.25%	12.18%	42.27%	67.66%	96.39%
Slow	15.06%	35.29%	17.09%	36.43%	65.97%	82.64%	21.91%	23.14%	22.42%	27.06%	52.01%	81.17%
SpeedLimit	4.81%	3.95%	5.26%	40.03%	84.72%	84.44%	10.67%	9.89%	10.65%	26.50%	87.02%	87.32%
mAP	13.52%	27.44%	14.45%	39.56%	63.22%	81.73%	17.94%	25.34%	18.17%	33.19%	61.05%	83.04%

Table 7 Results of the ablation study on the TT100K test set

Configuration	Setting		Result (mAP, %)
Configuration	BiFormer	cutout	Adam	MIM
①	×	×	59.18	63.57
②	√	×	60.73	64.49
③	×	√	84.28	86.16
④	√	√	84.72	86.79
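
The "cutout" setting in the ablation refers to a data augmentation in which a region of each training image is masked out. The sketch below follows the general cutout idea (DeVries and Taylor, 2017) of zeroing one randomly placed square; it is an assumption-laden illustration, since this page does not state the mask size, fill value, or number of masks used in the paper. The function name `cutout` and its parameters are hypothetical.

```python
import numpy as np


def cutout(image, size, rng, fill_value=0):
    """Mask one randomly centered square of side `size` (clipped at borders).

    image: H x W x C array; a modified copy is returned.
    size: side length of the square mask in pixels (assumed parameter).
    rng: a numpy.random.Generator, passed in for reproducibility.
    """
    h, w = image.shape[:2]
    # Pick the mask center uniformly over the image.
    cy = int(rng.integers(0, h))
    cx = int(rng.integers(0, w))
    half = size // 2
    # Clip the square to the image, so masks near the border are smaller.
    y1, y2 = max(0, cy - half), min(h, cy + half)
    x1, x2 = max(0, cx - half), min(w, cx + half)
    out = image.copy()
    out[y1:y2, x1:x2] = fill_value
    return out
```

Training on images with occluded regions exposes the detector to inputs that resemble images whose patch area has been covered, which is one plausible reading of why the cutout rows (③ and ④) score far higher than the rows without it.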

Table 8 Results of the ablation study on the TS4500 test set

Configuration	Setting		Result (mAP, %)
Configuration	BiFormer	cutout	Adam	PGD
①	×	×	63.65	68.72
②	√	×	67.58	69.21
③	×	√	79.45	82.98
④	√	√	81.73	83.04

Image class	Patch generation method
Image class	Adam	MIM
p26	85.30%	92.42%
pl40	82.41%	84.88%
p11	74.67%	78.72%
pne	88.81%	87.49%
i5	88.22%	88.26%
i4	88.93%	88.94%
mAP	84.72%	86.79%

Image class	Patch generation method
Image class	Adam	PGD
Stop	89.94%	90.93%
School	56.40%	55.67%
Crosswalk	82.28%	82.20%
Crossroad	76.25%	86.02%
NoEntry	89.50%	91.00%
NoPark	79.75%	76.62%
Park	94.37%	96.39%
Slow	82.64%	81.17%
SpeedLimit	84.44%	87.32%
mAP	81.73%	83.04%

Image class	Attack method
Image class	No attack	Adam	MIM
p26	94.22%	33.04%	29.00%
pl40	97.91%	4.79%	11.17%
p11	97.06%	18.31%	24.14%
pne	97.64%	25.35%	12.42%
i5	98.53%	11.95%	23.06%
i4	98.27%	12.75%	20.38%
mAP	97.27%	17.70%	20.03%
