A playback speech detection algorithm based on log inverse Mel-frequency spectral coefficient

doi:10.11959/j.issn.1000-0801.2018020

Abstract

The popularity and portability of high-fidelity audio recording and playback equipment pose a serious challenge to speaker recognition systems, which must withstand playback attacks. Exploiting the differences between original speech and playback speech in the high-frequency region, the proposed algorithm reverses the Mel filter bank used in Mel-frequency cepstral coefficient (MFCC) computation and takes the coefficients obtained before the discrete cosine transform (DCT) as features. A support vector machine (SVM) serves as the classifier. Experimental results show that the algorithm detects playback speech effectively. In addition, integrating the algorithm into a GMM-UBM speaker recognition system significantly improves the system's resistance to playback attacks.

Keywords: speaker recognition, playback speech detection, log Mel-frequency spectrum, inverse Mel filter bank
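
The feature extraction outlined in the abstract (a reversed Mel filter bank, with the log filter-bank outputs taken before the DCT) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame length, hop size, number of filters, Hamming window, and the choice to mirror a standard triangular Mel bank along the frequency axis are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Standard triangular Mel filters, shape (n_filters, n_fft // 2 + 1)."""
    n_bins = n_fft // 2 + 1
    mel_points = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    bank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def inverse_mel_filter_bank(n_filters, n_fft, sample_rate):
    """Mirror the Mel bank along frequency so narrow filters cover high frequencies.

    Reversing the rows as well keeps the filters ordered from low to high frequency.
    """
    return mel_filter_bank(n_filters, n_fft, sample_rate)[::-1, ::-1]

def log_inverse_mel_spectrum(signal, sample_rate, n_fft=512, hop=160, n_filters=24):
    """Frame-level log filter-bank energies (no DCT applied).

    n_fft, hop and n_filters are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hamming(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    bank = inverse_mel_filter_bank(n_filters, n_fft, sample_rate)
    # Small constant avoids log(0) on silent frames.
    return np.log(power @ bank.T + 1e-10)
```

The resulting frame-level matrix would then be summarized per utterance and passed to a classifier such as an SVM; how the paper pools frames is not stated here, so that step is left out.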

CLC Number: TN912.3

Lang LIN, Rangding WANG, Diqun YAN, Can LI. A playback speech detection algorithm based on log inverse Mel-frequency spectral coefficient[J]. Telecommunications Science, 2018, 34(5): 90-98.

References

[1]	ZHU D, MA B, LI H. Speaker verification with feature-space MAPLR parameters[J]. IEEE Transactions on Audio Speech & Language Processing, 2011, 19(3): 505-515.
[2]	YI K C, HU Z. A new speech synthesis method using vector quantization[J]. Telecommunications Science, 1987(11): 1-6.
[3]	GUO H. Authenticity verification and research of recording evidence[J]. Telecommunications Science, 2010, 26(Z2): 56-60.
[4]	LI C, WANG R D, YAN D Q, et al. Detection algorithm of riprap voice attack based on phase spectrum[J]. Telecommunications Science, 2017, 33(8): 145-154.
[5]	SHANG W, STEVENSON M. A playback attack detector for speaker verification systems[C]// IEEE International Symposium on Communications Control and Signal Processing (ISCCSP), March 12-14, 2008, St Julians, Malta. Piscataway: IEEE Press, 2008: 1144-1149.
[6]	SHANG W, STEVENSON M. Score normalization in playback attack detection[C]// IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), March 14-19, 2010, Dallas, USA. Piscataway: IEEE Press, 2010: 1678-1681.
[7]	ZHANG L P, CAO J, XU M X. Prevention of impostors entering speaker recognition systems[J]. Journal of Tsinghua University (Science and Technology), 2008, 48(S1): 699-703.
[8]	WANG Z F, HE Q H, ZHANG X Y, et al. Channel pattern noise based playback detection algorithm speaker recognition[J]. Journal of South China University of Technology (Natural Science Edition), 2011, 39(10): 7-12.
[9]	LI F Q, WAN H, HUANG J J. The display and analysis of sonogram based on MATLAB[J]. Control & Automation, 2005(20): 172-174.
[10]	BURILLO P, BUSTINCE H. Entropy on intuitionistic fuzzy sets and on interval-valued fuzzy sets[J]. Fuzzy Sets & Systems, 1996, 78(3): 305-316.
[11]	XIANG Y J, YANG J A, LI J H, et al. An improved Mel-frequency filter for speaker recognition[J]. Computer Engineering, 2013(11): 214-217.
[12]	TAO B R, GUO Q, MIAO F J, et al. SoC design of voiceprint features extraction based on improved Mel filter banks[J]. Microelectronics, 2015(6): 785-788.
[13]	HU Y G, WU Y, WANG H Z, et al. Discrete cosine transform in data dimensionality reduction[J]. Computer Engineering and Applications, 2006(32): 21-23.
[14]	MOHAMED A. Deep neural network acoustic models for ASR[J]. Doctoral, 2014.
[15]	CHANG C C, LIN C J. LIBSVM: a library for support vector machines[J]. ACM Transactions on Intelligent Systems & Technology, 2012, 2(3): 1-27.
[16]	WANG T Q, LI A J. The design of the continuous Chinese speech recognition corpus[C]// The Sixth National Conference on Modern Phonetics Learning, Oct 1, 2003, Tianjin, China. [S.l.: s.n.], 2003: 1-4.
[17]	CHAKROBORTY S, ROY A, SAHA G. Improved closed set text-independent speaker identification by combining MFCC with evidence from flipped filter banks[J]. International Journal of Signal Processing, 2007, 4(2): 114-122.

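Category	Original recording device	Covert recording device			Playback device	
Device	Aigo R6620	iPhone6	Mi4	Sony PX440	Huawei AM08	Philips DTM3115
Audio format	wav	m4a	mp3	mp3	—	—
Sampling rate	16 kHz	44.1 kHz	44.1 kHz	44.1 kHz	—	—
Bit depth/bit rate	16 bit/s	64 kbit/s	128 kbit/s	192 kbit/s	—	—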

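Speech	Original recording device	Playback device	Covert recording device	Number of samples
Original speech	Aigo R6620	—	—	2400
Playback speech	Aigo R6620	Huawei AM08	iPhone6, Mi4, Sony PX440	6300
Playback speech	Aigo R6620	Philips DTM3115	iPhone6, Mi4, Sony PX440	6300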

Feature	Philips DTM3115			Huawei AM08			Cross of both devices		
	FPR	TPR	ACC	FPR	TPR	ACC	FPR	TPR	ACC
MFCC	99.60%	1.30%	99.58%	96.90%	7.00%	96.92%	96.70%	16.90%	96.67%
I-MFCC	99.90%	0.20%	99.92%	98.2%	3.70%	98.16%	97.30%	14.00%	97.29%
MFSC	100%	0	100%	99.30%	0.20%	99.33%	99.70%	0.30%	99.67%
I-MFSC	100%	0	100%	100%	0	100%	99.90%	0.90%	99.86%

Training set		Test set: Huawei AM08			Test set: Philips DTM3115		
Playback device	Covert recording device	iPhone	Mi	Sony	iPhone	Mi	Sony
Huawei AM08	iPhone	100%	100%	100%	100%	100%	99.86%
Huawei AM08	Mi	98.43%	100%	96.79%	97.86%	100%	82%
Huawei AM08	Sony	100%	100%	100%	99.07%	99.50%	99.79%
Philips DTM3115	iPhone	99.85%	100%	92.79%	100%	100%	99.85%
Philips DTM3115	Mi	96.70%	99%	77.29%	99.93%	100%	72.79%
Philips DTM3115	Sony	100%	99.14%	100%	100%	97.04%	100%

Algorithm	Clean condition	30 dB noise	25 dB noise	20 dB noise	15 dB noise
MFCC	96.67%	95.66%	90.89%	87.74%	85.71%
I-MFCC	97.29%	96.57%	95.52%	90.89%	88.81%
MFSC	99.67%	98.62%	98.23%	97.57%	96.57%
I-MFSC	99.86%	99.35%	98.95%	98.21%	97.43%
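
The noise conditions in the table above can be reproduced in spirit with a short sketch. It assumes that the dB figures are signal-to-noise ratios and that the noise is additive white Gaussian; neither is confirmed by the table itself, and the function name is hypothetical.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng=None):
    """Return the signal plus white Gaussian noise scaled to a target SNR in dB.

    Assumption: SNR is the ratio of mean signal power to mean noise power.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.standard_normal(signal.shape) * np.sqrt(noise_power)
    return signal + noise
```

Lower SNR values mean stronger noise, which is consistent with the accuracy of every feature falling from left to right across the table.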

Algorithm	ACC	EER
Algorithm of reference [4]	75.42%	25.45%
Algorithm of reference [5]	83.23%	19.09%
Proposed algorithm	99.86%	5.90%
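
For readers unfamiliar with the EER column, the sketch below shows one common way to estimate an equal error rate from two sets of scores. How the paper computed its EER is not shown on this page, so the threshold sweep, the score convention (higher means more likely genuine), and the function name are assumptions for illustration only.

```python
import numpy as np

def equal_error_rate(genuine_scores, spoof_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    Assumption: a trial is accepted when its score is >= the threshold.
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    thresholds = np.sort(np.concatenate([genuine, spoof]))
    # False rejection rate: genuine trials scoring below the threshold.
    frr = np.array([np.mean(genuine < t) for t in thresholds])
    # False acceptance rate: spoofed trials scoring at or above the threshold.
    far = np.array([np.mean(spoof >= t) for t in thresholds])
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0
```

With few trials the two error curves may not cross exactly, so the sketch averages them at the closest point.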