Journal on Communications ›› 2022, Vol. 43 ›› Issue (12): 211-221. doi: 10.11959/j.issn.1000-436x.2022234
• Correspondences •
Hua LONG, Zhangheng HUANG, Yubin SHAO, Qingzhi DU, Shumeng SU
Revised: 2022-11-30
Online: 2022-12-25
Published: 2022-12-01
Hua LONG, Zhangheng HUANG, Yubin SHAO, Qingzhi DU, Shumeng SU. Research on language recognition algorithm based on improved CFCC feature extraction[J]. Journal on Communications, 2022, 43(12): 211-221.
Feature | Auditory characteristic function | Accuracy at -5 dB | Accuracy at 0 dB | Accuracy at 5 dB | Accuracy at 10 dB | Accuracy at 15 dB | Average recognition accuracy |
CFCC | 13 | 66.8% | 70.76% | 72.77% | 74.16% | 79.34% | 72.77% |
LCFCC | Logarithm | 63.73% | 68.6% | 74.83% | 75.06% | 78.7% | 72.18% |
CFCC0 | 0.101 | 67.73% | 71.46% | 75.36% | 80.96% | 83.46% | 75.79% |
CFCC1 | 115 | 65.63% | 73.8% | 76.86% | 78.43% | 80.1% | 74.96% |
FCFCC | 0.25 | 68.97% | 73.4% | 77.5% | 80.36% | 84.63% | 76.97% |
Feature | Classification network | Accuracy at -5 dB | Accuracy at 0 dB | Accuracy at 5 dB | Accuracy at 10 dB | Accuracy at 15 dB | Average recognition accuracy |
NFCFCCAF | FcaNet-MobileNetV2 | 75.5% | 81.87% | 83.66% | 85.8% | 88.42% | 83.05% |
NFCFCCAF | MobileNetV2 | 73.62% | 79.7% | 82.33% | 84.37% | 85.2% | 81.04% |
NFCFCCAF | ResNet | 74.2% | 80.36% | 81.9% | 84.5% | 85.63% | 81.30% |
NFCFCCAF-DS | FcaNet-MobileNetV2 | 81.2% | 82.97% | 85.93% | 87.8% | 90.37% | 85.65% |
NFCFCCAF-DS | MobileNetV2 | 79.5% | 80.63% | 83.82% | 85.97% | 88.1% | 83.60% |
NFCFCCAF-DS | ResNet | 77.16% | 80.2% | 81.2% | 84.54% | 88.25% | 82.27% |
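
The last column of each table above appears to be the arithmetic mean of the five per-SNR accuracies, although this page does not state that explicitly. A minimal sketch for checking that reading against two rows copied from the tables; the function name `mean_accuracy` and the dictionary keys are illustrative and do not come from the paper:

```python
def mean_accuracy(accuracies):
    """Arithmetic mean of per-SNR accuracies (in percent), rounded to two decimals."""
    return round(sum(accuracies) / len(accuracies), 2)

# Accuracies (%) at -5, 0, 5, 10 and 15 dB, copied from the tables above.
rows = {
    "CFCC": [66.8, 70.76, 72.77, 74.16, 79.34],
    "NFCFCCAF / FcaNet-MobileNetV2": [75.5, 81.87, 83.66, 85.8, 88.42],
}

for name, accs in rows.items():
    print(name, mean_accuracy(accs))
```

For these two rows the computed means are 72.77 and 83.05, which agree with the reported averages.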