HRDA-Net: image multiple manipulation detection and location algorithm in real scene

Journal on Communications, 2022, Vol. 43, Issue (1): 217-226. doi: 10.11959/j.issn.1000-436x.2022016

• Academic Communication •

Ye ZHU¹,², Yilin YU¹, Yingchun GUO¹

¹ School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
² Shenzhen Key Laboratory of Media Security, Shenzhen 518060, China

Revised: 2021-12-22 Online: 2022-01-25 Published: 2022-01-01
About the authors:
Ye ZHU (1989- ), female, born in Heze, Shandong; Ph.D.; lecturer and master's supervisor at Hebei University of Technology. Research interests: image security forensics, image processing and pattern recognition.
Yilin YU (1998- ), male, born in Nanping, Fujian; master's student at Hebei University of Technology. Research interest: image security forensics.
Yingchun GUO (1970- ), female, born in Zhangjiakou, Hebei; Ph.D.; associate professor and master's supervisor at Hebei University of Technology. Research interests: image processing and pattern recognition, artificial intelligence.
Supported by:
The National Natural Science Foundation of China (62102129, 61806071, 91746207); The Natural Science Foundation of Hebei Province (F2021202030, F2020202025, F2019202381, F2019202464); The Sci-Tech Research Projects of Higher Education of Hebei Province (QN2019207, QN2020185)

Abstract

Mainstream manipulation datasets contain only one type of tampering operation per image, and localization on real images suffers from artifacts. To address both problems, a multiple manipulation dataset (MM Dataset) was constructed for real scenes, in which each tampered image contains both splicing and removal. For the multiple manipulation detection and localization task, an end-to-end high-resolution representation dilation attention network (HRDA-Net) was proposed, which fuses the RGB-domain and SRM-domain features of the image through a top-down dilation convolutional attention (TDDCA) module. A mixed dilated convolution (MDC) module then extracts features separately for the splicing, removal, and manipulation detection tasks, which yields both localization of the tampered regions and a prediction of the manipulation confidence. A cosine similarity loss was proposed as an auxiliary loss to improve the training efficiency of the network. Experimental results show that on MM Dataset, HRDA-Net achieves better performance and stronger robustness than mainstream semantic segmentation methods. On the single-manipulation datasets CASIA and NIST, its F1 and AUC scores compare favorably with those of mainstream single-manipulation localization methods.

Key words: deep learning, multiple manipulation detection and location, multiple manipulation dataset (MM Dataset), cosine similarity loss function
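
The abstract names a cosine similarity loss as an auxiliary loss but does not give its formula. As a minimal sketch, assuming the common form "one minus the cosine similarity between the flattened predicted mask and the flattened ground-truth mask", the idea can be written as follows. The function name `cosine_similarity_loss`, its arguments, and the `eps` stabilizer are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity_loss(pred, target, eps=1e-8):
    """Return 1 - cos(pred, target) for two flattened masks.

    Assumed form only: the paper's exact definition is not given in the abstract.
    pred and target are equal-length sequences of floats (for example,
    per-pixel tamper probabilities and a 0/1 ground-truth mask).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    dot = sum(p * t for p, t in zip(pred, target))
    norm_pred = math.sqrt(sum(p * p for p in pred))
    norm_target = math.sqrt(sum(t * t for t in target))
    # eps keeps the division defined when one of the masks is all zeros.
    return 1.0 - dot / (norm_pred * norm_target + eps)


# A prediction identical to the ground truth gives a loss close to 0;
# a prediction with no overlap gives a loss of 1.
perfect = cosine_similarity_loss([1.0, 0.0, 1.0], [1.0, 0.0, 1.0])
disjoint = cosine_similarity_loss([1.0, 0.0], [0.0, 1.0])
```

Under this assumed form the loss depends only on the direction of the two vectors, not on their magnitude, which is one reason such a term is usually paired with a main pixel-wise loss rather than used alone.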

CLC number: TP391

Cite this article:

Ye ZHU, Yilin YU, Yingchun GUO. HRDA-Net: image multiple manipulation detection and location algorithm in real scene[J]. Journal on Communications, 2022, 43(1): 217-226.

Table 2. Ablation results for the effectiveness of the model components

HRNet	SRM DB	TDDCA	MDC	L_cos (localization)	L_cos (detection)	Splicing F1	Removal F1	fp	Accuracy
√	×	×	×	√	×	0.804	0.485	0.303	0.766
√	√	×	×	√	×	0.490	0.156	0.365	0.692
√	√	√	×	√	×	0.894	0.558	0.162	0.828
√	√	×	√	√	×	0.745	0.413	0.177	0.775
√	√	√	√	×	×	0.891	0.564	0.152	0.836
√	√	√	√	√	√	0.899	0.576	0.150	0.833
√	√	√	√	√	×	0.899	0.576	0.150	0.850

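Table 3. Ablation results for the effectiveness of the training method

Training method	Splicing F1	Removal F1	fp	Accuracy
Multiple manipulation localization	0.899	0.576	0.150	-
Multiple manipulation detection	-	-	-	0.892
Stepwise training, parameters not frozen	0.537	0.221	0.246	0.913
Stepwise training, parameters frozen	0.899	0.576	0.150	0.850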

Table 5. Comparison of HRDA-Net with mainstream semantic segmentation models

Model	Splicing precision	Splicing recall	Splicing F1	Removal precision	Removal recall	Removal F1
FCN	0.962	0.616	0.636	0.580	0.225	0.305
Deeplabv3	0.831	0.727	0.770	0.760	0.385	0.507
PSPNet	0.808	0.685	0.734	0.581	0.327	0.407
DANet	0.718	0.799	0.751	0.717	0.228	0.344
RRU-Net	0.690	0.793	0.727	0.457	0.224	0.286
HRNet	0.867	0.768	0.804	0.700	0.388	0.485
HRDA-Net	0.888	0.913	0.899	0.764	0.481	0.576
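
The precision, recall, and F1 columns above are localization scores. As a rough illustration of how such scores are commonly obtained for one image, the sketch below compares a binary predicted mask with a binary ground-truth mask pixel by pixel. The function name `pixel_prf` and the flattened-list mask representation are assumptions made for illustration; the page does not state how the scores were computed or averaged.

```python
def pixel_prf(pred_mask, true_mask):
    """Pixel-level precision, recall and F1 for one pair of binary masks.

    Hypothetical helper: masks are flat sequences of 0/1 values of equal
    length, where 1 marks a pixel labeled as tampered.
    """
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of pixels")
    tp = fp = fn = 0
    for p, t in zip(pred_mask, true_mask):
        if p and t:
            tp += 1  # tampered pixel correctly flagged
        elif p and not t:
            fp += 1  # authentic pixel wrongly flagged
        elif t and not p:
            fn += 1  # tampered pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# One true positive, one false positive and one false negative.
scores = pixel_prf([1, 1, 0, 0], [1, 0, 1, 0])
```

If scores like these are averaged over images, which is an assumption here, the mean F1 need not equal the harmonic mean of the mean precision and mean recall. That would be consistent with rows in the table where the F1 column differs from the harmonic mean of the two neighboring columns.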

Table 6. Comparison of single-manipulation localization on the CASIA and NIST datasets

Model	F1 (NIST)	F1 (CASIA)	AUC (NIST)	AUC (CASIA)
NoI	0.285	0.263	0.487	0.612
CFA	0.174	0.207	0.501	0.522
RGB-N	0.722	0.408	0.937	0.795
LSTM-En	-	0.391	0.793	0.762
GSCNet	0.837	0.471	0.917	0.833
SEINet	0.891	0.488	0.980	0.801
HRDA-Net	0.951	0.496	0.993	0.833
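
The AUC columns above summarize how well per-pixel scores separate tampered from authentic pixels. A minimal sketch of the standard rank-based definition follows: AUC is the probability that a randomly chosen positive pixel receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` is hypothetical, and the quadratic loop is meant for illustration on small inputs, not as the evaluation code used in the paper.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores: per-pixel tamper scores; labels: matching 0/1 ground truth.
    Illustrative O(P * N) version, where P and N are the positive and
    negative counts.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


# Three of the four positive/negative pairs are ordered correctly.
auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

Unlike F1, this measure does not depend on a binarization threshold, which is why localization papers often report both.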

Dataset	Number of tampered images	Purpose
Self-built splicing dataset	15,000	Pre-training
MM Dataset	800/200	Fine-tuning/testing
CASIA v2.0	5,059	Fine-tuning
CASIA v1.0	908	Testing
NIST	404/160	Fine-tuning/testing

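Post-processing operation	Parameter	Values
JPEG compression	Compression factor	{100, 90, 80, 70, 60, 50}
Gaussian noise	Noise parameter	{0.06, 0.05, 0.04, 0.03, 0.02, 0.01}
Gaussian blur	Radius	{1.0, 1.2, 1.4, 1.6, 1.8, 2.0}
Brightness	-	{1.0, 1.1, 1.2, 1.3, 1.4, 1.5}
Contrast	-	{1.0, 1.1, 1.2, 1.3, 1.4, 1.5}
Color balance	-	{1.0, 1.1, 1.2, 1.3, 1.4, 1.5}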
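
The parameter values in the table above can be captured as a small configuration grid, for example to drive a robustness sweep over post-processed test images. This is only a sketch: the dictionary keys and the helper `iter_settings` are hypothetical names, and the assumption that each operation is applied on its own, one value at a time, is not stated on this page.

```python
# Values copied from the post-processing table; key names are hypothetical.
POST_PROCESSING_GRID = {
    "jpeg_compression": [100, 90, 80, 70, 60, 50],
    "gaussian_noise": [0.06, 0.05, 0.04, 0.03, 0.02, 0.01],
    "gaussian_blur_radius": [1.0, 1.2, 1.4, 1.6, 1.8, 2.0],
    "brightness": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5],
    "contrast": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5],
    "color_balance": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5],
}


def iter_settings(grid):
    """Yield (operation, value) pairs, one post-processing setting at a time."""
    for operation, values in grid.items():
        for value in values:
            yield operation, value


# Six operations with six values each give 36 single-operation settings.
settings = list(iter_settings(POST_PROCESSING_GRID))
```

Each pair would then be handed to whatever image library performs the actual operation; that step is left out here because the page does not say which implementation was used.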
