Journal on Communications ›› 2022, Vol. 43 ›› Issue (11): 65-79. doi: 10.11959/j.issn.1000-436x.2022222

Chinese semantic and phonological information-based text proofreading model for speech recognition

Meiyu ZHONG1, Peiliang WU1,2, Yan DOU1,3, Yi LIU1, Lingfu KONG1,2   

    1 School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
    2 The Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Qinhuangdao 066004, China
    3 The Key Laboratory of Software Engineering of Hebei Province, Qinhuangdao 066004, China
  • Revised: 2022-10-24 Online: 2022-11-25 Published: 2022-11-01
  • Supported by:
    The National Key Research and Development Program of China (2018YFB1308300); The National Natural Science Foundation of China (62276028); The National Natural Science Foundation of China (U20A20167); Beijing Natural Science Foundation (4202026); The Natural Science Foundation of Hebei Province (F202103079); The Innovation Capability Improvement Plan Project of Hebei Province (22567626H); The Project of the Key Laboratory of Software Engineering of Hebei Province (22567637H)

Abstract:

To study the influence of Chinese Pinyin on detecting and correcting text errors in speech recognition output, a text proofreading model based on Chinese semantic and phonological information was proposed. Five Pinyin coding methods were designed to construct character-Pinyin embedding vectors, which served as the input to a Seq2Seq model based on gated recurrent units (GRU). An attention mechanism was adopted to extract the semantic and phonological information of sentences in order to correct speech recognition errors. To address the shortage of labeled corpora, a data augmentation method was introduced that automatically generates annotated corpora by exchanging the initials or finals of Chinese Pinyin. Experimental results on the public AISHELL-3 dataset show that phonological information helps the text proofreading model detect and correct errors in speech recognition output, and that the proposed data augmentation method improves the model's error detection performance.
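
The idea of generating annotated pairs by exchanging Pinyin initials or finals can be illustrated with a minimal Python sketch. It is not the authors' implementation: the confusion tables `INITIAL_SWAPS` and `FINAL_SWAPS`, the helper names, and the choice to work on toneless Pinyin syllables are all assumptions made for illustration. Mapping the perturbed Pinyin back to Chinese characters would require a Pinyin-to-character lexicon, which is not shown here.

```python
# Pinyin initials, with "zh", "ch", "sh" listed first so they match before
# "z", "c", "s". "y" and "w" are treated as initials for simplicity.
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]

# Hypothetical confusion pairs (assumption, not taken from the paper).
INITIAL_SWAPS = {"zh": "z", "z": "zh", "ch": "c", "c": "ch",
                 "sh": "s", "s": "sh", "n": "l", "l": "n"}
FINAL_SWAPS = {"an": "ang", "ang": "an", "en": "eng", "eng": "en",
               "in": "ing", "ing": "in"}


def split_syllable(syllable):
    """Split a toneless Pinyin syllable into (initial, final)."""
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    return "", syllable  # zero-initial syllable such as "ai"


def perturb_syllable(syllable, swap_initial):
    """Exchange the initial or the final with a confusable counterpart."""
    initial, final = split_syllable(syllable)
    if swap_initial:
        initial = INITIAL_SWAPS.get(initial, initial)
    else:
        final = FINAL_SWAPS.get(final, final)
    return initial + final


def augment(pinyin_seq, position, swap_initial=True):
    """Return a (noisy, reference) pair with one syllable perturbed."""
    noisy = list(pinyin_seq)
    noisy[position] = perturb_syllable(noisy[position], swap_initial)
    return noisy, list(pinyin_seq)
```

For example, `augment(["zhong", "wen"], 0)` perturbs the first syllable's initial and pairs the noisy sequence with the untouched reference, so the label comes for free. Syllables with no entry in the confusion tables are returned unchanged.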

Key words: text proofreading, speech recognition, Pinyin, attention mechanism
