Telecommunications Science, 2023, Vol. 39, Issue (2): 92-102. doi: 10.11959/j.issn.1000-0801.2023006

• Research and Development •

Spoof speech detection based on context information and attention feature

Jia CHEN1, Jianwu ZHANG1, Zheliang ZHANG2   

  • 1 Hangzhou Dianzi University, Hangzhou 310018, China
    2 Zhejiang Uniview Technologies Co., Ltd., Hangzhou 310051, China
  • Revised: 2023-01-05  Online: 2023-02-20  Published: 2023-02-01
  • Supported by:
    The National Natural Science Foundation of China (U1866209); The National Natural Science Foundation of China (61772162)

Abstract:

With the rapid development of speech synthesis and voice conversion technology, existing spoof speech detection methods still suffer from low detection accuracy and poor generalization. To address this, an end-to-end spoof detection method based on context information and attention features is proposed. Built on a deep residual shrinkage network (DRSN), the method uses a dual-branch context information coordination fusion module (DCCM) to aggregate rich context information, and fuses features using coordinate time-frequency attention (CTFA) to obtain cross-dimensional interaction features that carry context information, thereby maximizing its ability to capture artifacts. Compared with the best baseline system, the proposed method reduces the equal error rate (EER) and tandem detection cost function (t-DCF) by 68% and 65%, respectively, on the ASVspoof 2019 LA dataset; on the ASVspoof 2021 LA dataset, it achieves an EER of 4.81 and a t-DCF of 0.3115, reductions of 48% and 10%, respectively. The experimental results show that the method effectively improves both the accuracy and the generalization ability of spoof speech detection.
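
The EER reported in the abstract is the error rate at the decision threshold where the false acceptance rate (spoofed speech accepted) equals the false rejection rate (bona fide speech rejected). The following is a minimal sketch of that computation, not the authors' evaluation code: the function name `compute_eer` and the convention that higher scores mean "more likely bona fide" are assumptions made for illustration. The t-DCF is not shown, since it additionally requires speaker verification scores and cost parameters.

```python
import numpy as np

def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold); assumes higher scores mean 'more bona fide'."""
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Every observed score is a candidate decision threshold.
    thresholds = np.sort(np.concatenate([bonafide, spoof]))

    # False rejection rate: bona fide trials scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    # The EER is taken where the two error rates are closest to each other.
    idx = np.argmin(np.abs(far - frr))
    eer = (far[idx] + frr[idx]) / 2
    return eer, thresholds[idx]

# Hypothetical toy scores: one bona fide trial (0.4) overlaps the spoofed range.
eer, threshold = compute_eer([0.9, 0.8, 0.4, 0.7], [0.1, 0.2, 0.6, 0.3])
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With finitely many trials the two error curves rarely cross exactly, so this sketch averages them at the closest point; evaluation toolkits may interpolate instead, which can give slightly different values.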

Key words: spoof speech detection, context information, attention feature, end-to-end, artifacts

