Telecommunications Science ›› 2023, Vol. 39 ›› Issue (1): 72-78. doi: 10.11959/j.issn.1000-0801.2023005

• Research and Development •

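Synthetic spoofing speech detection method based on center-symmetric local binary pattern

Jia XU1, Zhihua JIAN1, Honghui JIN1, Chao WU1, Lin YOU2, Yingxiao WU3

  1 School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
  2 School of Cyberspace Security, Hangzhou Dianzi University, Hangzhou 310018, China
  3 School of Computer, Hangzhou Dianzi University, Hangzhou 310018, China
  • Revised: 2022-12-15 Online: 2023-01-20 Published: 2023-01-01
  • About the authors:
    Jia XU (1998- ), female, master's student at the School of Communication Engineering, Hangzhou Dianzi University. Her main research interest is spoofing speech detection
    Zhihua JIAN (1978- ), male, associate professor and master's supervisor at the School of Communication Engineering, Hangzhou Dianzi University. His main research interests include voice conversion, spoofing speech detection, and voiceprint recognition
    Honghui JIN (1999- ), male, master's student at the School of Communication Engineering, Hangzhou Dianzi University. His main research interests are voice conversion and spoofing speech detection
    Chao WU (1988- ), male, lecturer at the School of Communication Engineering, Hangzhou Dianzi University. His main research interests are navigation signal processing and spoofing interference detection
    Lin YOU (1966- ), male, professor and doctoral supervisor at the School of Cyberspace Security, Hangzhou Dianzi University. His main research interests include biological information processing, information security, and cryptography
    Yingxiao WU (1980- ), female, distinguished professor at the School of Computer, Hangzhou Dianzi University. Her main research interests are millimeter-wave sensing for voiceprint recognition and authentication, radio-frequency information processing, and the industrial Internet
  • Supported by:
    The National Natural Science Foundation of China (61201301); The National Natural Science Foundation of China (61772166); The National Natural Science Foundation of China (61901154)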

Abstract:

Local binary pattern (LBP) based spoofing speech detection methods achieve low accuracy when detecting synthetic speech. To address this, a spoofing speech detection method based on the center-symmetric local binary pattern (CSLBP) was proposed. In this method, the spectrogram of the speech signal was obtained through the short-time Fourier transform (STFT), and texture features were then extracted from the spectrogram using CSLBP. A random forest classifier was trained on the extracted texture features to discriminate between genuine and spoofed speech. The CSLBP-based method considered both the values and the positional relationships of pixels in the spectrogram, thereby capturing more comprehensive texture information, and it reduced the feature dimension to 16, which helps lower the computational cost. Experimental results on the ASVspoof 2019 dataset show that, compared with the conventional LBP-based spoofing detection method, the proposed method reduced the tandem detection cost function (t-DCF) for synthetic spoofing speech by 16.98% and increased the detection speed by 89.73%.

Key words: speaker verification, spoofing speech detection, center-symmetric local binary pattern (CSLBP), random forest
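
The feature extraction pipeline described in the abstract (STFT spectrogram, then CSLBP codes, then a 16-dimensional texture feature) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the common CSLBP formulation over a 3x3 neighborhood, in which four center-symmetric pixel pairs are compared and each comparison contributes one bit, giving 2^4 = 16 possible codes. The function names `stft_magnitude` and `cslbp_histogram`, the frame length, the hop size, and the zero threshold are all hypothetical choices made for illustration. The sketch uses only the Python standard library so that it runs anywhere; a practical system would use an FFT library instead of the naive DFT shown here.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=16, hop=8):
    """Naive STFT magnitude (Hann window, direct DFT).

    Returns a list of rows indexed [frequency_bin][frame].
    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(n_bins)
        ])
    # Transpose from [frame][bin] to [bin][frame], like a spectrogram image.
    return [list(row) for row in zip(*frames)]


# Four center-symmetric neighbor pairs of a 3x3 neighborhood,
# written as (row offset, column offset) relative to the center pixel.
PAIRS = [
    ((0, 1), (0, -1)),
    ((1, 1), (-1, -1)),
    ((1, 0), (-1, 0)),
    ((1, -1), (-1, 1)),
]


def cslbp_histogram(image, threshold=0.0):
    """Normalized 16-bin histogram of CSLBP codes over interior pixels.

    The threshold is an assumed parameter; the paper's value is not
    given in the abstract.
    """
    rows, cols = len(image), len(image[0])
    hist = [0] * 16
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            code = 0
            for bit, ((dr1, dc1), (dr2, dc2)) in enumerate(PAIRS):
                diff = image[r + dr1][c + dc1] - image[r + dr2][c + dc2]
                if diff > threshold:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * 16


if __name__ == "__main__":
    # Toy input: a short sine wave standing in for a speech signal.
    toy_signal = [math.sin(2 * math.pi * 0.1 * n) for n in range(64)]
    feature = cslbp_histogram(stft_magnitude(toy_signal))
    print(len(feature))
```

The resulting 16-dimensional vector is the kind of feature that could then be passed to a random forest classifier; that training step is omitted here because the abstract does not specify its configuration.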

