Journal on Communications ›› 2022, Vol. 43 ›› Issue (12): 211-221. doi: 10.11959/j.issn.1000-436x.2022234

• Academic Communications •

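Research on language recognition algorithm based on improved CFCC feature extraction

Hua LONG, Zhangheng HUANG, Yubin SHAO, Qingzhi DU, Shumeng SU

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
  • Revised: 2022-11-30 Published: 2022-12-25 Online: 2022-12-01
  • About the authors: Hua LONG (1963- ), female, Hui ethnicity, born in Dali, Yunnan, Ph.D., professor at Kunming University of Science and Technology. Her main research interests include wireless networks, audio signal processing, and language recognition.
    Zhangheng HUANG (1997- ), male, Yi ethnicity, born in Qujing, Yunnan, master's student at Kunming University of Science and Technology. His main research interests include audio signal processing and language recognition.
    Yubin SHAO (1970- ), male, born in Qujing, Yunnan, professor at Kunming University of Science and Technology. His main research interests include mobile and personal communication systems and signal processing.
    Qingzhi DU (1977- ), male, born in Chuxiong, Yunnan, associate professor at Kunming University of Science and Technology. His main research interests include speech signal processing and language recognition.
    Shumeng SU (1996- ), male, born in Baoshan, Yunnan, master's student at Kunming University of Science and Technology. His main research interests include audio signal processing and speech recognition.
  • Supported by:
    The National Natural Science Foundation of China (61761025)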

Abstract:

To address the low accuracy of language recognition at low signal-to-noise ratios, a language recognition algorithm based on the fractional wavelet transform is proposed. First, an adaptive filter at the front end of feature extraction removes noise from the noisy signal, which reduces the influence of noise on feature extraction and improves the system's ability to handle noisy signals. Second, a new fractional wavelet transform is used as the wavelet basis function to simulate the propagation of the signal along the basilar membrane of the cochlea, and the signal is then compressed by a nonlinear power function. Finally, the improved cochlear filter cepstral coefficients (CFCC) are extracted by simulating the human hearing process. Experimental results show that the improved CFCC significantly raises language recognition accuracy compared with the traditional CFCC, with an average gain of 11.1% at a 0 dB signal-to-noise ratio, which verifies the effectiveness and robustness of the proposed algorithm.

Key words: language recognition, adaptive filtering, fractional wavelet transform, neural network, cochlear filter cepstral coefficient
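
The last two stages named in the abstract, power-function compression followed by cepstral coefficient extraction, can be illustrated with a minimal Python sketch. It assumes the cochlear filter bank output has already been reduced to non-negative per-band energies for one frame. The cube-root exponent, the unnormalized DCT-II, and all function names are illustrative assumptions, not details given in the abstract, and the adaptive filter and fractional wavelet filter bank are not modeled here.

```python
import math


def power_compress(energies, exponent=1.0 / 3.0):
    """Apply power-law compression to non-negative subband energies.

    The exponent 1/3 is an assumed value; the abstract only states that
    a nonlinear power function is used.
    """
    return [e ** exponent for e in energies]


def dct_ii(values, num_coeffs):
    """Unnormalized DCT-II of `values`, keeping the first `num_coeffs` terms."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(num_coeffs)
    ]


def cepstral_coefficients(frame_energies, num_coeffs=3, exponent=1.0 / 3.0):
    """Compress one frame of subband energies, then decorrelate with a DCT."""
    compressed = power_compress(frame_energies, exponent)
    return dct_ii(compressed, num_coeffs)


# Hypothetical frame with four subbands of equal energy.
frame = [8.0, 8.0, 8.0, 8.0]
coeffs = cepstral_coefficients(frame, num_coeffs=3)
print(coeffs)
```

For a flat frame like this one, all of the energy lands in the zeroth coefficient and the higher-order coefficients are numerically zero, which is the decorrelating behavior the DCT step is meant to provide.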

