Chinese Journal of Network and Information Security, 2022, Vol. 8, Issue (6): 146-155. doi: 10.11959/j.issn.2096-109x.2022075

• Academic paper

Lip forgery detection via spatial-frequency domain combination

Jiaying LIN1,2, Wenbo ZHOU1,2, Weiming ZHANG1,2, Nenghai YU1,2

  1. Key Laboratory of Electromagnetic Space Information, Chinese Academy of Sciences, Hefei 230027, China
  2. School of Cyber Science, University of Science and Technology of China, Hefei 230027, China
  • Revised: 2022-07-09 Online: 2022-12-15 Published: 2023-01-16
  • About the authors:
    Jiaying LIN (1997- ), female, from Ganzhou, Jiangxi; master's student at University of Science and Technology of China. Research interests: artificial intelligence security and information hiding.
    Wenbo ZHOU (1992- ), male, from Hefei, Anhui; associate research fellow (special appointment) at University of Science and Technology of China. Research interests: information hiding and artificial intelligence security.
    Weiming ZHANG (1976- ), male, from Dingzhou, Hebei; professor and doctoral supervisor at University of Science and Technology of China. Research interests: information hiding, multimedia content security, and artificial intelligence security.
    Nenghai YU (1964- ), male, from Wuwei, Anhui; professor and doctoral supervisor at University of Science and Technology of China. Research interests: multimedia information retrieval, image processing and video communication, and digital media content security.
  • Supported by:
    The National Natural Science Foundation of China (U20B2047, 62072421, 62002334, 62102386, 62121002); Exploration Fund Project of University of Science and Technology of China (YD3480002001); Fundamental Research Funds for the Central Universities (WK2100000011)

Abstract:

In recent years, "face-swapping" videos have proliferated on social networks, and lip forgery of speakers is one representative type. While such videos entertain the public, they also pose considerable risks to personal privacy and property security in cyberspace. Most lip forgery detection methods perform well under lossless conditions. However, compression is widespread in practice, for example on social media platforms and in face recognition scenarios. Compression removes pixel and temporal redundancy, but it also degrades video quality and destroys the coherence between pixels and between frames in the spatial domain, which lowers detection performance and can cause real videos to be misjudged. When spatial-domain information cannot provide sufficiently effective features, frequency-domain information, which is more resistant to compression interference, naturally becomes a focus of research. To address this problem, the advantages of frequency information in image structure and gradient feedback were analyzed, and a lip forgery detection method combining the spatial and frequency domains was proposed to exploit the respective characteristics of both kinds of information. For lip features in the spatial domain, an adaptive extraction network and a lightweight attention module were designed. For frequency features in the frequency domain, modules that separately extract and then fuse different frequency components were designed. The spatial lip features and the frequency features were then combined through a weighted fusion so that more key texture information was preserved. In addition, fine-grained constraints were designed for training to enlarge the inter-class distance between real and fake lip features while reducing the intra-class distance. Experimental results show that, benefiting from frequency information, the proposed method effectively improves detection accuracy under compression and exhibits a degree of transferability. Ablation experiments on the core modules further verify the effectiveness of the frequency components for compression resistance and the constraining role of the dual loss function in training.

Key words: DeepFake forgery, DeepFake detection and defense, lip forgery detection, anti-compression, deep learning
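
The abstract refers to extracting different frequency components separately before fusing them, but it does not specify the transform or the band boundaries. As a rough illustration only, the following sketch splits a grayscale lip crop into low- and high-frequency parts with a circular mask on the 2-D FFT spectrum. The function name `split_frequency_bands`, the `radius_ratio` parameter, and the choice of the FFT are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def split_frequency_bands(gray, radius_ratio=0.25):
    """Split a 2-D grayscale image into low- and high-frequency parts.

    Hypothetical helper: the circular FFT mask and radius_ratio are
    illustrative assumptions, not the paper's separation module.
    """
    h, w = gray.shape
    # Centre the spectrum so the zero frequency sits at (h // 2, w // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    yy, xx = np.ogrid[:h, :w]
    dist = np.hypot(yy - h // 2, xx - w // 2)
    # Frequencies inside the circle count as "low", the rest as "high".
    low_mask = dist <= radius_ratio * min(h, w)
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    crop = rng.random((64, 64))  # stand-in for a grayscale lip region
    low, high = split_frequency_bands(crop)
    # The two bands partition the spectrum, so they sum back to the input.
    print(np.allclose(low + high, crop))
```

Because the two masks partition the spectrum, the bands sum back to the input, so a downstream network could in principle process each band with its own branch and fuse the results, which is the general idea the abstract describes.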
