Journal on Communications ›› 2023, Vol. 44 ›› Issue (5): 64-78. doi: 10.11959/j.issn.1000-436x.2023070

• Papers •

End-to-end scene text detection and recognition algorithm based on Transformer decoders

Jinzhi ZHENG1,2, Ruyi JI1,2, Libo ZHANG1,3, Chen ZHAO1,3   

  1 Intelligent Software Research Center, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
  2 University of Chinese Academy of Sciences, Beijing 100190, China
  3 State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
  • Revised: 2023-01-31 Online: 2023-05-25 Published: 2023-05-01

Abstract:

To address the detection and recognition of arbitrarily shaped text in natural scenes, a novel scene text detection and recognition algorithm that can be trained end to end is proposed. First, a detection branch built on a segmentation-based text-aware module detects scene text from the visual features extracted by a convolutional network. Then, a recognition branch based on a Transformer vision module and a Transformer language module encodes the text features of the detection results. Finally, a fusion gate in the recognition branch fuses the encoded text features to output the scene text. Experimental results on three benchmark datasets, Total-Text, ICDAR2013 and ICDAR2015, show that the proposed algorithm achieves excellent recall, precision and F-score, and has certain advantages in efficiency.
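
The abstract does not give the form of the fusion gate, so the following is only a minimal sketch of one common way to gate two feature streams, not the authors' implementation. It assumes an element-wise sigmoid gate computed from the concatenated visual and language features; the names `gated_fusion`, `weights` and `bias` are hypothetical and stand in for learned parameters.

```python
import math
from typing import List

Vector = List[float]


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(
    visual: Vector,
    language: Vector,
    weights: List[Vector],
    bias: Vector,
) -> Vector:
    """Fuse two d-dimensional feature vectors with an element-wise gate.

    Assumed form (not taken from the paper):
        g     = sigmoid(W [visual; language] + b)
        fused = g * visual + (1 - g) * language
    `weights` is a d x 2d matrix and `bias` has length d; both are
    hypothetical stand-ins for learned parameters.
    """
    d = len(visual)
    if len(language) != d or len(weights) != d or len(bias) != d:
        raise ValueError("visual, language, weights and bias must agree on d")

    concat = visual + language  # list concatenation, length 2d
    fused = []
    for i in range(d):
        if len(weights[i]) != 2 * d:
            raise ValueError("each weight row must have length 2d")
        # Gate value for dimension i, computed from both feature streams.
        z = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        g = sigmoid(z)
        # Convex combination: g near 1 favours the visual feature,
        # g near 0 favours the language feature.
        fused.append(g * visual[i] + (1.0 - g) * language[i])
    return fused


# With all-zero parameters the gate is 0.5 everywhere, so the fused
# vector is the plain average of the two inputs.
zero_weights = [[0.0] * 4 for _ in range(2)]
averaged = gated_fusion([1.0, 0.0], [0.0, 1.0], zero_weights, [0.0, 0.0])
```

Under this assumed form the gate lets the model weight the visual prediction more heavily when the image evidence is clear and lean on the language prediction when it is not; the actual gate in the paper may differ.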

Key words: text detection, text recognition, end-to-end, Transformer
