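Spoofing speech detection algorithm based on joint feature and random forest

Telecommunications Science, 2022, Vol. 38, Issue (6): 91-99. doi: 10.11959/j.issn.1000-0801.2022089

Jiaqi YU (于佳祺)¹, Zhihua JIAN (简志华)¹, Jia XU (徐嘉)¹, Lin YOU (游林)², Yunlu WANG (汪云路)², Chao WU (吴超)¹

¹ School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
² School of Cyberspace Security, Hangzhou Dianzi University, Hangzhou 310018, China

Revised: 2022-05-15  Online: 2022-06-20  Published: 2022-06-01
About the authors:
Jiaqi YU (1997- ), male, master's student at the School of Communication Engineering, Hangzhou Dianzi University. His research interests include spoofed speech detection and feature extraction and analysis.
Zhihua JIAN (1978- ), male, Ph.D., associate professor and master's supervisor at the School of Communication Engineering, Hangzhou Dianzi University. His research interests include voice conversion, spoofed speech detection, and voiceprint recognition.
Jia XU (1998- ), female, master's student at the School of Communication Engineering, Hangzhou Dianzi University. Her research interests include speech spoofing and its detection.
Lin YOU (1966- ), male, Ph.D., professor and master's supervisor at the School of Cyberspace Security, Hangzhou Dianzi University. His research interests include biometric information processing, information security, and cryptography.
Yunlu WANG (1980- ), female, Ph.D., lecturer at the School of Cyberspace Security, Hangzhou Dianzi University. Her research interests include audio information processing and information hiding.
Chao WU (1988- ), male, Ph.D., lecturer at the School of Communication Engineering, Hangzhou Dianzi University. His research interests include navigation signal processing and spoofing interference detection.
Supported by:
The National Natural Science Foundation of China (61201301); The National Natural Science Foundation of China (61772166); The National Natural Science Foundation of China (61901154)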

Abstract:

To describe the characteristics of a speech signal more comprehensively and to improve the spoofing detection rate, a spoofed speech detection method is proposed that combines uniform local binary pattern (LBP) texture features with constant Q cepstral coefficient (CQCC) acoustic features and uses a random forest as the classifier. Uniform LBP is used to extract a texture feature vector from the spectrogram of the speech signal, and this vector is combined with the CQCC to form a joint feature. The joint feature vectors are then used to train a random forest classifier, which performs the spoofed speech detection. In the experiments, the proposed system was compared with several spoofing detection systems built from other feature parameters and from a support vector machine (SVM) classifier. The results show that the system combining the joint feature with the random forest model achieves the best detection performance.

Key words: spoofing speech detection, acoustic feature, texture feature, uniform local binary pattern, random forest
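
The feature construction described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes an 8-neighbour, radius-1 uniform LBP computed over a spectrogram stored as a 2-D list, a normalised 59-bin histogram as the texture vector, and a CQCC vector already produced by some other routine. The names uniform_lbp_histogram and joint_feature are hypothetical.

```python
from typing import List, Sequence

# Offsets of the 8 neighbours around a centre cell, in circular order (radius 1).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def transitions(code: int) -> int:
    """Count 0/1 transitions in the circular 8-bit pattern."""
    rotated = ((code << 1) | (code >> 7)) & 0xFF
    return bin(code ^ rotated).count("1")


# A pattern is "uniform" if it has at most two circular transitions.
# Each uniform pattern gets its own bin; all other patterns share one extra bin.
UNIFORM_CODES = [c for c in range(256) if transitions(c) <= 2]
BIN_OF_CODE = {code: index for index, code in enumerate(UNIFORM_CODES)}
NON_UNIFORM_BIN = len(UNIFORM_CODES)


def uniform_lbp_histogram(image: Sequence[Sequence[float]]) -> List[float]:
    """Normalised uniform-LBP histogram of a 2-D array (border cells are skipped)."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    hist = [0] * (NON_UNIFORM_BIN + 1)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            centre = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                # Threshold each neighbour against the centre value.
                if image[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            hist[BIN_OF_CODE.get(code, NON_UNIFORM_BIN)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * len(hist)


def joint_feature(texture: Sequence[float], cqcc: Sequence[float]) -> List[float]:
    """Concatenate the texture vector with an externally computed CQCC vector."""
    return list(texture) + list(cqcc)


if __name__ == "__main__":
    # Toy 4x4 "spectrogram" and a placeholder CQCC vector (assumed values).
    toy_spectrogram = [
        [0.1, 0.4, 0.3, 0.2],
        [0.5, 0.9, 0.6, 0.1],
        [0.2, 0.7, 0.8, 0.3],
        [0.3, 0.1, 0.2, 0.4],
    ]
    placeholder_cqcc = [0.0, 0.0, 0.0]
    feature = joint_feature(uniform_lbp_histogram(toy_spectrogram), placeholder_cqcc)
    print(len(feature))
```

The joint vectors produced this way would then be used to train a random forest classifier (for example scikit-learn's RandomForestClassifier), a step this sketch leaves out.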

CLC number: TP391.42

Jiaqi YU, Zhihua JIAN, Jia XU, Lin YOU, Yunlu WANG, Chao WU. Spoofing speech detection algorithm based on joint feature and random forest[J]. Telecommunications Science, 2022, 38(6): 91-99.



Table 1
Spoofing type	SVM + MFCC	SVM + CQCC	RF + MFCC	RF + CQCC
A01	0.599	0.017	0.438	0.011
A02	0.325	0.014	0.220	0.004
A03	0.415	0.248	0.220	0.112
A04	0.414	0.153	0.326	0.018
A05	0.272	0.097	0.242	0.094
A06	0.234	0.126	0.171	0.112
A07	0.503	0.007	0.343	0.007
A08	0.359	0.024	0.253	0.033
A09	0.457	0.063	0.364	0.028
A10	0.368	0.003	0.208	0.022
A11	0.466	0.024	0.244	0.015
A12	0.428	0.093	0.375	0.015
A13	0.282	0.007	0.255	0.003
A14	0.382	0.014	0.339	0.052
A15	0.298	0.070	0.199	0.060
A16	0.417	0.037	0.244	0.018
A17	0.332	0.090	0.212	0.123
A18	0.280	0.248	0.202	0.211
A19	0.265	0.107	0.179	0.092
Average	0.373	0.076	0.265	0.054
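
The average row of the table above can be re-derived from the per-attack rows. The short sketch below recomputes the four column means and also counts the attacks for which the RF + CQCC value is strictly lower than the SVM + CQCC value. The variable and function names are illustrative only, and the code makes no assumption about which metric the table reports.

```python
# Per-attack rows copied from the table: (attack, SVM+MFCC, SVM+CQCC, RF+MFCC, RF+CQCC).
ROWS = [
    ("A01", 0.599, 0.017, 0.438, 0.011),
    ("A02", 0.325, 0.014, 0.220, 0.004),
    ("A03", 0.415, 0.248, 0.220, 0.112),
    ("A04", 0.414, 0.153, 0.326, 0.018),
    ("A05", 0.272, 0.097, 0.242, 0.094),
    ("A06", 0.234, 0.126, 0.171, 0.112),
    ("A07", 0.503, 0.007, 0.343, 0.007),
    ("A08", 0.359, 0.024, 0.253, 0.033),
    ("A09", 0.457, 0.063, 0.364, 0.028),
    ("A10", 0.368, 0.003, 0.208, 0.022),
    ("A11", 0.466, 0.024, 0.244, 0.015),
    ("A12", 0.428, 0.093, 0.375, 0.015),
    ("A13", 0.282, 0.007, 0.255, 0.003),
    ("A14", 0.382, 0.014, 0.339, 0.052),
    ("A15", 0.298, 0.070, 0.199, 0.060),
    ("A16", 0.417, 0.037, 0.244, 0.018),
    ("A17", 0.332, 0.090, 0.212, 0.123),
    ("A18", 0.280, 0.248, 0.202, 0.211),
    ("A19", 0.265, 0.107, 0.179, 0.092),
]
SYSTEMS = ["SVM+MFCC", "SVM+CQCC", "RF+MFCC", "RF+CQCC"]


def column_means(rows):
    """Mean of each system column, rounded to three decimals like the table."""
    n = len(rows)
    return [round(sum(row[i + 1] for row in rows) / n, 3) for i in range(len(SYSTEMS))]


def count_rf_cqcc_lower(rows):
    """Number of attacks where RF+CQCC is strictly lower than SVM+CQCC."""
    return sum(1 for row in rows if row[4] < row[2])


means = column_means(ROWS)
rf_lower = count_rf_cqcc_lower(ROWS)

if __name__ == "__main__":
    print(dict(zip(SYSTEMS, means)))
    print(rf_lower, "of", len(ROWS))
```

The recomputed means match the published average row. The per-attack count shows that the RF + CQCC value is not lower for every individual attack type, even though its average is.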

Table 2
Detection system	Average time (s)
SVM	1007.142
RF	0.1067
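
As a quick arithmetic check on the timing table above, the ratio of the two average times can be computed directly. The variable names are illustrative, and the snippet does not assume which stage of the pipeline was timed.

```python
# Average times in seconds, copied from the timing table.
svm_seconds = 1007.142
rf_seconds = 0.1067

# How many times longer the SVM system takes than the RF system.
ratio = svm_seconds / rf_seconds

if __name__ == "__main__":
    print(f"SVM / RF average time ratio: {ratio:.0f}")
```

By these figures the SVM system takes roughly 9,400 times as long as the RF system.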
