Chinese Journal of Intelligent Science and Technology ›› 2021, Vol. 3 ›› Issue (3): 351-358. doi: 10.11959/j.issn.2096-6652.202136

• Special Issue: Intelligent Object Detection and Recognition

End-to-end speech enhancement based on ultra-lightweight channel attention

Yi HONG1, Chengli SUN1, Yan LENG2   

  1. School of Information Engineering, Nanchang Hangkong University, Nanchang 330063, China
  2. School of Physics and Electronic, Shandong Normal University, Jinan 250014, China
  • Revised: 2021-07-17 Online: 2021-09-15 Published: 2021-09-01
  • Supported by:
    The National Natural Science Foundation of China (61861033); The Key Project of Natural Science Foundation of Jiangxi Province (20202ACBL202007); The Natural Science Foundation of Shandong Province (ZR2020MF020)

Abstract:

The fully convolutional time-domain audio separation network (Conv-TasNet) is a recently proposed, state-of-the-art end-to-end speech separation model. Conv-TasNet uses dilated convolution to expand the receptive field and fuse more speech features spatially, which greatly improves its separation performance, but it ignores the importance of information across different convolution channels. An end-to-end speech enhancement method based on ultra-lightweight channel attention is proposed, which effectively combines Conv-TasNet with channel attention. In addition, a group of filters is added to the Conv-TasNet codec to improve the network's speech feature extraction ability. The method enables the convolutional neural network to combine spatial information and channel information more effectively and thereby improve speech enhancement. Experiments show that the proposed model effectively improves speech enhancement performance while increasing model capacity by only about 0.02%.
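The abstract does not say how the channel attention module is built. As a rough illustration only, the sketch below assumes an ECA-style design: average each channel over time, mix neighbouring channel scores with a small shared 1-D kernel, and rescale each channel by a sigmoid gate. The function name `channel_attention`, the `kernel` argument, and the toy values are hypothetical and are not taken from the paper.

```python
import math


def channel_attention(features, kernel):
    """Reweight the channels of a (channels x time) feature map.

    features: list of C rows, each a list of T floats (one row per channel).
    kernel:   odd-length list of k weights shared by all channels
              (hypothetical values; a real model would learn them).
    Returns (scaled_features, gates).
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    num_channels = len(features)
    pad = len(kernel) // 2

    # 1. Squeeze: global average pooling over the time axis.
    pooled = [sum(row) / len(row) for row in features]

    # 2. Local cross-channel interaction: a 1-D convolution along the
    #    channel axis with zero padding, so only k weights are needed.
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [
        sum(w * padded[c + j] for j, w in enumerate(kernel))
        for c in range(num_channels)
    ]

    # 3. Gate: a sigmoid maps each channel score into (0, 1).
    gates = [1.0 / (1.0 + math.exp(-m)) for m in mixed]

    # 4. Excite: scale every time step of a channel by that channel's gate.
    scaled = [[g * x for x in row] for g, row in zip(gates, features)]
    return scaled, gates


# Toy input: 3 channels, 2 time steps each.
toy_features = [[1.0, 3.0], [0.0, 0.0], [-2.0, -2.0]]
toy_scaled, toy_gates = channel_attention(toy_features, kernel=[0.0, 1.0, 0.0])
```

Under this assumption the module adds only `len(kernel)` weights regardless of the channel count, which is one plausible way an attention block could leave model capacity almost unchanged.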

Key words: speech enhancement, end-to-end speech separation network, channel attention
