Telecommunications Science, 2018, Vol. 34, Issue (5): 90-98. doi: 10.11959/j.issn.1000-0801.2018020

• Research and Development •

Playback speech detection algorithm based on inverse Mel log-spectral coefficients

LIN Lang, WANG Rangding, YAN Diqun, LI Can

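  1. Ningbo University, Ningbo, Zhejiang 315211
  • Revised: 2017-12-07 Online: 2018-05-01 Published: 2018-05-30
  • Author biographies: LIN Lang (1994-), male, master's student at the College of Information Science and Engineering, Ningbo University; his research interests include multimedia communication and information security. | WANG Rangding (1962-), male, PhD, professor and doctoral supervisor at the College of Information Science and Engineering, Ningbo University; his research interests include multimedia communication and forensics, information hiding and steganalysis, smart meter reading, and sensor network technology. | YAN Diqun (1979-), male, PhD, associate professor and master's supervisor at the College of Information Science and Engineering, Ningbo University; his research interests include multimedia communication, information security, and deep-learning-based digital speech forensics. | LI Can (1992-), female, master's student at the College of Information Science and Engineering, Ningbo University; her research interests include multimedia communication and information security.
  • Funding:
    National Natural Science Foundation of China (61672302); National Natural Science Foundation of China (61300055); Natural Science Foundation of Zhejiang Province (LZ15F020002); Natural Science Foundation of Zhejiang Province (LY17F020010); Scientific Research Foundation of Ningbo University (XKXL1405); Scientific Research Foundation of Ningbo University (XKXL1420); Scientific Research Foundation of Ningbo University (XKXL1509); Scientific Research Foundation of Ningbo University (XKXL1503); K.C. Wong Magna Fund in Ningbo University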

A playback speech detection algorithm based on log inverse Mel-frequency spectral coefficient

Lang LIN, Rangding WANG, Diqun YAN, Can LI

  1. Ningbo University, Ningbo 315211, China
  • Revised: 2017-12-07 Online: 2018-05-01 Published: 2018-05-30
  • Supported by:
    The National Natural Science Foundation of China (61672302); The National Natural Science Foundation of China (61300055); The Natural Science Foundation of Zhejiang Province of China (LZ15F020002); The Natural Science Foundation of Zhejiang Province of China (LY17F020010); The Scientific Research Foundation of Ningbo University (XKXL1405); The Scientific Research Foundation of Ningbo University (XKXL1420); The Scientific Research Foundation of Ningbo University (XKXL1509); The Scientific Research Foundation of Ningbo University (XKXL1503); K.C. Wong Magna Fund in Ningbo University

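Abstract:

The growing availability and portability of high-fidelity recording and playback devices pose a serious challenge to the ability of speaker recognition systems to resist playback speech attacks. Spectrogram analysis is used to examine the differences between original speech and playback speech in the high-frequency region. Targeting these differences, the Mel filter bank used in computing Mel-frequency cepstral coefficients is inverted, and the Mel log-spectral coefficients obtained before the DCT are taken as the algorithm's features. Finally, a support vector machine is used as the classifier to judge the speech under test. Experimental results show that the algorithm can effectively detect playback speech. Moreover, adding the algorithm to a GMM-UBM speaker recognition system significantly improves the system's resistance to playback speech attacks.

Keywords: speaker recognition, playback speech detection, Mel log spectrum, inverse Mel filter bank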

Abstract:

The popularity and portability of high-fidelity recording and playback equipment pose a serious challenge to the robustness of speaker recognition systems against playback attacks. Based on the differences between original speech and playback speech in the high-frequency region, the algorithm reverses the Mel-filter bank used in Mel-frequency cepstral coefficient (MFCC) calculation and uses the log spectral coefficients obtained before the DCT as its features. A support vector machine (SVM) is used as the classifier. Experimental results show that the algorithm can effectively detect playback speech. In addition, integrating the algorithm into a GMM-UBM speaker recognition system significantly improves the system's capability to resist playback attacks.

Key words: speaker recognition, playback speech detection, log Mel-frequency spectrum, inverse Mel-filter bank
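
The feature extraction outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 16 kHz sample rate, the 512-point FFT, the 160-sample hop, the Hamming window, and the 24 filters are all assumptions chosen for illustration, and the inverse filter bank is built by flipping a standard triangular Mel filter bank along both the filter and frequency axes, which is one common construction. The sketch stops at the log filter-bank outputs, that is, before any DCT.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Triangular Mel filters, shape (n_filters, n_fft // 2 + 1)."""
    n_bins = n_fft // 2 + 1
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                            n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    bank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        # Triangle: the smaller of the two slopes, clipped at zero.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def inverse_mel_filter_bank(n_filters, n_fft, sample_rate):
    """Assumed construction: flip the Mel bank along both axes so that
    the narrow, densely spaced filters cover the high frequencies."""
    return mel_filter_bank(n_filters, n_fft, sample_rate)[::-1, ::-1].copy()


def log_inverse_mel_spectrum(signal, sample_rate=16000, n_fft=512,
                             hop=160, n_filters=24):
    """Per-frame log filter-bank energies (no DCT applied).

    All default parameter values are illustrative assumptions.
    """
    window = np.hamming(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    bank = inverse_mel_filter_bank(n_filters, n_fft, sample_rate)
    # Small constant avoids log(0) on silent frames.
    return np.log(power @ bank.T + 1e-10)
```

The resulting frame-level matrix would still need to be reduced to a fixed-length vector (for example by averaging over frames, which is an assumption here) before being passed to an SVM classifier.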

