Journal on Communications ›› 2015, Vol. 36 ›› Issue (9): 47-54. doi: 10.11959/j.issn.1000-436x.2015241

• Academic paper

Eigenphone speaker adaptation based on the sparse group LASSO constraint

Dan QU, Wen-lin ZHANG

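  1. Institute of Information System Engineering, Information Engineering University, Zhengzhou, Henan 450000
  • Publication date: 2015-09-25 Release date: 2017-09-15
  • Funding:
    National Natural Science Foundation of China; National Natural Science Foundation of China; National Natural Science Foundation of China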

Sparse group LASSO constraint eigenphone speaker adaptation method for speech recognition

Dan QU, Wen-lin ZHANG

  1. Institute of Information System Engineering, PLA Information Engineering University, Zhengzhou 450000, China
  • Online: 2015-09-25 Published: 2017-09-15
  • Supported by:
    The National Natural Science Foundation of China; The National Natural Science Foundation of China; The National Natural Science Foundation of China

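Abstract (translated from the Chinese original):

The eigenphone speaker adaptation method overfits severely when the amount of adaptation data is insufficient. To address this, an eigenphone speaker adaptation algorithm based on the sparse group LASSO constraint is proposed. First, the basic principle of eigenphone speaker adaptation under the hidden Markov model-Gaussian mixture model framework is presented. Then, sparse group LASSO regularization is introduced into eigenphone speaker adaptation; the complexity of the model is controlled by adjusting the weighting factors, and the resulting problem is solved with an accelerated proximal gradient optimization algorithm. Finally, the sparse group LASSO constrained adaptation algorithm is compared with several current regularized adaptation methods. Speaker adaptation experiments on Mandarin continuous speech recognition show that introducing the sparse group LASSO constraint markedly improves the performance of eigenphone speaker adaptation, and that the sparse group LASSO constraint outperforms l1, l2, and elastic net regularization.

Key words: speaker adaptation, eigenphone, group sparse constraint, sparse group LASSO constraint, proximal gradient method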
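
For orientation, the sparse group LASSO regularization named in the abstract is usually written in the generic form below. This is the textbook formulation rather than the paper's exact objective: the data-fit term f, the partition of the parameter matrix V into groups V_g, and the weights λ1 and λ2 are notation assumed here for illustration.

```latex
\min_{V}\; f(V) \;+\; \lambda_1 \lVert V \rVert_1 \;+\; \lambda_2 \sum_{g=1}^{G} \lVert V_g \rVert_2
```

Setting λ2 = 0 reduces the penalty to the plain l1 (LASSO) case, and setting λ1 = 0 reduces it to the group LASSO case, which is why adjusting the two weights controls how sparse the solution is within and across groups.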

Abstract:

The original eigenphone speaker adaptation method performed well when the amount of adaptation data was sufficient. However, it suffered from severe overfitting when insufficient adaptation data was provided. A sparse group LASSO (SGL) constrained eigenphone speaker adaptation method was proposed. Firstly, the principle of eigenphone speaker adaptation was introduced for speech recognition systems based on the hidden Markov model-Gaussian mixture model (HMM-GMM). Then, a sparse group LASSO constraint was applied to the estimation of the eigenphone matrix. The weight of the SGL norm was adjusted to control the complexity of the adaptation model. Finally, an accelerated proximal gradient method was adopted to solve the resulting optimization problem. The method was compared with current norm-constrained algorithms. Experiments on a Mandarin Chinese continuous speech recognition task show that the SGL-constrained eigenphone method markedly outperforms the original eigenphone method, and is also superior to the l1-norm, l2-norm, and elastic net constraint methods.

Key words: speaker adaptation, eigenphone, group sparse constraint, sparse group LASSO constraint, proximal gradient method
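
The combination of a sparse group LASSO penalty with an accelerated proximal gradient solver, as described in the abstract, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a least-squares loss stands in for the paper's adaptation objective, each row of the matrix V is treated as one group, and the function names, `lam1`, and `lam2` are hypothetical.

```python
import numpy as np


def prox_sparse_group_lasso(V, step, lam1, lam2):
    """Proximal operator of step * (lam1 * ||V||_1 + lam2 * sum_g ||V[g]||_2).

    Assumption for illustration: every row of V is one group.
    """
    # Element-wise soft-thresholding handles the l1 part of the penalty.
    S = np.sign(V) * np.maximum(np.abs(V) - step * lam1, 0.0)
    # Row-wise shrinkage handles the group (l2) part of the penalty.
    norms = np.linalg.norm(S, axis=1, keepdims=True)
    scale = np.maximum(1.0 - step * lam2 / np.maximum(norms, 1e-12), 0.0)
    return scale * S


def accelerated_proximal_gradient(A, B, lam1, lam2, n_iter=500):
    """Minimize 0.5 * ||A @ V - B||_F^2 plus the penalty above (FISTA-style)."""
    # Step size 1/L, where L is the Lipschitz constant of the smooth gradient.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    V = np.zeros((A.shape[1], B.shape[1]))
    Y, t = V.copy(), 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ Y - B)  # gradient of the smooth least-squares term
        V_next = prox_sparse_group_lasso(Y - step * grad, step, lam1, lam2)
        # Momentum update that gives the method its accelerated rate.
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Y = V_next + ((t - 1.0) / t_next) * (V_next - V)
        V, t = V_next, t_next
    return V
```

Calling `accelerated_proximal_gradient(A, B, lam1, lam2)` with larger `lam1` zeroes more individual entries of V, while larger `lam2` zeroes whole rows; with both weights at zero the routine reduces to accelerated gradient descent on the least-squares term.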
