Chinese Journal of Network and Information Security ›› 2023, Vol. 9 ›› Issue (3): 150-160. doi: 10.11959/j.issn.2096-109x.2023046

• Academic Paper •

Gender forgery of faces by fusing wavelet shortcut connection generative adversarial network

Wanze CHEN, Liqing HUANG, Jiazhen CHEN, Feng YE, Tianqiang HUANG, Haifeng LUO

  1. College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350117, China
  • Revised: 2023-03-02 Online: 2023-06-25 Published: 2023-06-01
  • About the authors: Wanze CHEN (1995- ), male, born in Lanzhou, Gansu, is a master's student at Fujian Normal University; his main research interests include deep learning, generative adversarial networks, and image generation
    Liqing HUANG (1991- ), female, born in Putian, Fujian, is a lecturer at Fujian Normal University; her main research interests include video/image super-resolution and multimedia content security
    Jiazhen CHEN (1971- ), female, born in Fuzhou, Fujian, is an associate professor at Fujian Normal University; her main research interest is information security
    Feng YE (1978- ), male, born in Fuzhou, Fujian, is a professor at Fujian Normal University; his main research interests include video communication and video/image processing
    Tianqiang HUANG (1971- ), male, born in Xianyou, Fujian, is a professor and doctoral supervisor at Fujian Normal University; his main research interests include artificial intelligence, big data analytics, and the theory and technology of digital media content security
    Haifeng LUO (1990- ), male, born in Longyan, Fujian, is a lecturer at Fujian Normal University; his main research interests include 2D/3D computer vision
  • Supported by:
    The National Natural Science Foundation of China (62072106); the Fujian Province Natural Science Foundation (2020J01168); the Fujian Provincial Department of Education (Rank A) (JAT210053)

Gender forgery of faces by fusing wavelet shortcut connection generative adversarial network

Wanze CHEN, Liqing HUANG, Jiazhen CHEN, Feng YE, Tianqiang HUANG, Haifeng LUO   

  1. College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350117, China
  • Revised: 2023-03-02 Online: 2023-06-25 Published: 2023-06-01
  • Supported by:
The National Natural Science Foundation of China (62072106); the Fujian Province Natural Science Foundation (2020J01168); the Fujian Provincial Department of Education (Rank A) (JAT210053)

Abstract:

Owing to limitations of data and model architecture, mainstream methods for facial attribute editing suffer from the following two problems. First, the bottleneck structure of the autoencoder loses feature information during encoding and decoding, and the continuous stage-by-stage injection of styles into source-domain features during decoding lets target-domain information take up too large a share of the generated image, so the identity information and some details of the source domain are lost. Second, differences in attributes such as the gender, ethnicity, or age of the person in a face image lead to large differences in the frequency-domain composition of different images, and under unsupervised training the current mainstream network frameworks cannot adjust the ratio of source-domain to target-domain information at the style-injection stage, so the generated images still contain artifacts. To address these problems, a facial gender forgery model based on generative adversarial networks and image-to-image translation, the fused wavelet shortcut connection generative adversarial network (WscGAN), was proposed. Shortcut connections were added to the autoencoder structure, the outputs of different encoding stages were decomposed at the feature level by wavelet transform, and a channel attention mechanism was introduced to process the resulting components one by one, dynamically changing the proportion of source-domain feature information at different frequencies during decoding and ultimately forging the gender attribute of facial images. To verify the effectiveness of the proposed model, WscGAN was evaluated on the CelebA-HQ and FFHQ datasets. The experimental results show that WscGAN outperforms existing state-of-the-art models on both CelebA-HQ and FFHQ, improving the Fréchet inception distance (FID) by 5.4% and 19.8% and the learned perceptual image patch similarity (LPIPS) by 1.8% and 4.1%, respectively. In addition, qualitative visual comparisons fully demonstrate that WscGAN effectively improves gender attribute forgery of facial images.
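The frequency-level decomposition performed by the shortcut connections can be illustrated with a single-level 2D Haar transform. The following is a minimal sketch, assuming PyTorch and an orthonormal Haar filter bank; the abstract does not specify which wavelet basis or decomposition depth the model actually uses.

import torch

def haar_dwt(x: torch.Tensor):
    """Single-level 2D Haar transform of a feature map x of shape (B, C, H, W).
    Returns the low-frequency band LL and three high-frequency detail bands,
    each with spatial size (H/2, W/2). H and W are assumed to be even."""
    a = x[:, :, 0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[:, :, 0::2, 1::2]  # top-right
    c = x[:, :, 1::2, 0::2]  # bottom-left
    d = x[:, :, 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2  # local average: coarse structure
    lh = (a + b - c - d) / 2  # detail sub-band (one orientation)
    hl = (a - b + c - d) / 2  # detail sub-band (other orientation)
    hh = (a - b - c + d) / 2  # diagonal detail sub-band
    return ll, lh, hl, hh

# Example: decompose a hypothetical 64-channel encoder output.
feat = torch.randn(1, 64, 128, 128)
ll, lh, hl, hh = haar_dwt(feat)
print(ll.shape)  # torch.Size([1, 64, 64, 64])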

Key words: image generation, generative adversarial network, image-to-image translation, facial attribute manipulation, wavelet transform

Abstract:

The mainstream methods in the field of facial attribute manipulation have the following two defects due to data and model architecture limitations. First, the bottleneck structure of the autoencoder model results in the loss of feature information, and continuously injecting styles into the source-domain features during the decoding process makes the generated image depend too heavily on the target domain while losing identity information and fine-grained details. Second, differences between images in facial attributes such as gender, ethnicity, or age cause variations in frequency-domain information, and the current unsupervised training methods cannot automatically adjust the proportion of source-domain and target-domain information in the style injection stage, resulting in artifacts in the generated images. A facial gender forgery model based on generative adversarial networks and image-to-image translation, namely the fused wavelet shortcut connection generative adversarial network (WscGAN), was proposed to address these issues. Shortcut connections were added to the autoencoder structure, and the outputs of different encoding stages were decomposed at the feature level by wavelet transform. A channel attention mechanism was then employed to process the resulting sub-bands one by one, dynamically changing the proportion of source-domain features at different frequencies in the decoding process. This model can forge facial images in terms of the gender attribute. To verify its effectiveness, experiments were conducted on the CelebA-HQ and FFHQ datasets. Compared with the existing optimal models, the method improves the FID index by 5.4% and 11.2% and the LPIPS index by 1.8% and 6.7%, respectively. Furthermore, qualitative visual comparisons fully demonstrate the effectiveness of the proposed method in improving the gender attribute conversion of facial images.
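Building on the decomposition sketched above, the following sketch shows one way the channel attention described in the abstract might gate each wavelet sub-band of an encoder feature before it is merged back into the decoding path. The squeeze-and-excitation style gate, the 1×1 fusion convolution, and the additive injection are assumptions made for illustration; the abstract only states that the sub-bands are processed one by one by a channel attention mechanism.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel gate (an assumed design;
    the abstract only states that channel attention is applied)."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                        # squeeze: one value per channel
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                   # per-channel weight in [0, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.gate(x)                             # reweight the channels

class WaveletShortcutFusion(nn.Module):
    """Gate each wavelet sub-band of an encoder feature separately, then
    merge the gated sub-bands into the decoder feature of matching size."""
    def __init__(self, channels: int, num_bands: int = 4):
        super().__init__()
        self.band_gates = nn.ModuleList(
            ChannelAttention(channels) for _ in range(num_bands)
        )
        self.fuse = nn.Conv2d(channels * num_bands, channels, kernel_size=1)

    def forward(self, bands, dec_feat):
        gated = [gate(b) for gate, b in zip(self.band_gates, bands)]
        shortcut = self.fuse(torch.cat(gated, dim=1))       # recombine the gated sub-bands
        return dec_feat + shortcut                          # inject into the decoding path

# Example: four sub-bands (e.g. from the Haar sketch above) at half the encoder
# resolution, fused with a decoder feature of the same spatial size.
bands = [torch.randn(1, 64, 64, 64) for _ in range(4)]
dec_feat = torch.randn(1, 64, 64, 64)
out = WaveletShortcutFusion(64)(bands, dec_feat)
print(out.shape)  # torch.Size([1, 64, 64, 64])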

Key words: image generation, generative adversarial network, image-to-image translation, facial attribute manipulation, wavelet transform

