Chinese Journal of Network and Information Security ›› 2021, Vol. 7 ›› Issue (5): 105-112. doi: 10.11959/j.issn.2096-109x.2021041

• Column II: Machine Learning and Security Applications •

Chinese named entity recognition based on an improved Transformer encoder

郑洪浩, 于洪涛, 李邵梅   

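  1. Information Engineering University, Zhengzhou 450002, Henan, China
  • Revised: 2020-12-25 Online: 2021-10-01 Published: 2021-10-01
  • About the authors: Honghao ZHENG (born 1992), male, from Jining, Shandong; master's student at Information Engineering University; main research interests: named entity recognition and relation extraction
    Hongtao YU (born 1970), male, from Dandong, Liaoning; Ph.D.; research fellow at Information Engineering University; main research interests: big data and artificial intelligence
    Shaomei LI (born 1982), female, from Zhongxiang, Hubei; Ph.D.; associate research fellow at Information Engineering University; main research interest: computer vision
  • Funding:
    Young Scientists Fund of the National Natural Science Foundation of China (62002384); National Key R&D Program of China (2016QY03D0502); Zhengzhou Major Collaborative Innovation Project (162/32410218)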

Chinese NER based on improved Transformer encoder

Honghao ZHENG, Hongtao YU, Shaomei LI   

  1. Information Engineering University, Zhengzhou 450002, China
  • Revised: 2020-12-25 Online: 2021-10-01 Published: 2021-10-01
  • Supported by:
    The National Natural Science Foundation of China (62002384); The National Key R&D Program of China (2016QY03D0502); Major Collaborative Innovation Projects of Zhengzhou (162/32410218)

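Abstract (translated from the Chinese):

To improve the performance of Chinese named entity recognition, a method based on the XLNET-Transformer_P-CRF model is proposed. The method uses a Transformer_P encoder, which remedies the inability of the traditional Transformer encoder to capture relative position information. Experimental results show that the XLNET-Transformer_P-CRF model reaches F1 scores of 95.11%, 80.54%, 96.70%, and 71.46% on the MSRA, OntoNotes4.0, Resume, and Weibo datasets, respectively, all higher than those of mainstream Chinese named entity recognition models.

Keywords (translated from the Chinese): Chinese named entity recognition, Transformer encoder, relative position information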

Abstract:

To improve the performance of Chinese named entity recognition, a method based on the XLNET-Transformer_P-CRF model was proposed. It uses the Transformer_P encoder, which addresses the shortcoming that the traditional Transformer encoder cannot obtain relative position information. Experiments show that the XLNET-Transformer_P-CRF model achieves F1 values of 95.11%, 80.54%, 96.70%, and 71.46% on four datasets (MSRA, OntoNotes4.0, Resume, and Weibo), respectively, all higher than those of other mainstream Chinese NER models.

Key words: Chinese named entity recognition, Transformer encoder, relative position information
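
The abstract's point about relative position information can be illustrated with a small sketch. It is not the paper's Transformer_P encoder, whose details the abstract does not give. It only shows a known property of the standard sinusoidal position encoding: the dot product of two position vectors depends on the distance between the positions but not on the direction. The helper names (`sinusoidal_encoding`, `signed_offset_matrix`) and the dimension `d_model=64` are assumptions chosen for illustration.

```python
import math

def sinusoidal_encoding(position, d_model=64):
    """Standard sinusoidal position vector (sin/cos pairs) for one position."""
    vec = []
    for i in range(0, d_model, 2):
        freq = 1.0 / (10000 ** (i / d_model))
        vec.append(math.sin(position * freq))
        vec.append(math.cos(position * freq))
    return vec

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def signed_offset_matrix(seq_len):
    """Signed offsets j - i: the kind of input a relative position scheme
    could use to distinguish left context from right context (illustrative)."""
    return [[j - i for j in range(seq_len)] for i in range(seq_len)]

# Compare a token at position 10 with neighbours 3 steps to the left and right.
center = sinusoidal_encoding(10)
left = dot(center, sinusoidal_encoding(7))
right = dot(center, sinusoidal_encoding(13))

# The two scores match up to floating-point error, so the raw encoding
# cannot tell "3 before" from "3 after".
print(abs(left - right) < 1e-9)

# Signed offsets, by contrast, keep the direction explicit.
print(signed_offset_matrix(3))
```

Under these assumptions, the symmetric score suggests why a direction-aware signal such as the signed offset is attractive for sequence labeling, where whether a character precedes or follows an entity boundary matters.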
