Chinese Journal of Network and Information Security ›› 2021, Vol. 7 ›› Issue (5): 105-112. DOI: 10.11959/j.issn.2096-109x.2021041
Honghao ZHENG, Hongtao YU, Shaomei LI
Revised: 2020-12-25
Online: 2021-10-15
Published: 2021-10-01
About the author: Honghao ZHENG (1992−), male, from Jining, Shandong Province, is a master's student at Information Engineering University. His main research interests include named entity recognition and relation extraction.
Abstract: To improve the performance of Chinese named entity recognition (NER), a method based on the XLNET-Transformer_P-CRF model is proposed. The method adopts the Transformer_P encoder, which remedies the inability of the conventional Transformer encoder to capture relative position information. Experimental results show that the XLNET-Transformer_P-CRF model achieves F1 scores of 95.11%, 80.54%, 96.70%, and 71.46% on the MSRA, OntoNotes4.0, Resume, and Weibo datasets respectively, outperforming the mainstream models for Chinese NER on all four.
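The abstract does not spell out the internals of the Transformer_P encoder, but the idea it names, injecting relative position information into self-attention, can be illustrated with a minimal sketch. The snippet below is an illustration in the style of learned relative position representations (Shaw et al., 2018), not the authors' implementation; the class name `RelativeSelfAttention` and hyperparameters such as `max_dist` are invented for the example.

```python
# Minimal sketch (assumption: Shaw-et-al.-style relative position embeddings,
# not the paper's actual Transformer_P code) of self-attention that sees
# relative distances, which a vanilla Transformer encoder does not.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelativeSelfAttention(nn.Module):
    def __init__(self, d_model: int, max_dist: int = 128):
        super().__init__()
        self.d_model = d_model
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # One embedding per clipped relative distance in [-max_dist, max_dist].
        self.rel_emb = nn.Embedding(2 * max_dist + 1, d_model)
        self.max_dist = max_dist

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        n = x.size(1)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        # Relative distance j - i for every token pair, clipped and
        # shifted into valid embedding indices [0, 2 * max_dist].
        pos = torch.arange(n, device=x.device)
        rel = (pos[None, :] - pos[:, None]).clamp(-self.max_dist, self.max_dist)
        r = self.rel_emb(rel + self.max_dist)        # (n, n, d_model)
        # Content-content score plus content-position score.
        scores = q @ k.transpose(-2, -1)             # (batch, n, n)
        scores = scores + torch.einsum("bid,ijd->bij", q, r)
        scores = scores / self.d_model ** 0.5
        attn = F.softmax(scores, dim=-1)
        return attn @ v                              # (batch, n, d_model)
```

In the full pipeline the abstract describes, an encoder of this kind sits between the XLNet embedding layer and a CRF decoding layer, which predicts the label sequence.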
Honghao ZHENG, Hongtao YU, Shaomei LI. Chinese NER based on improved Transformer encoder[J]. Chinese Journal of Network and Information Security, 2021, 7(5): 105-112.
Table 1 Detailed statistics of the Chinese named entity recognition datasets
Dataset | Train sentences | Train characters | Dev sentences | Dev characters | Test sentences | Test characters
MSRA | 46 400 | 2 169 900 | — | — | 4 400 | 172 600
OntoNotes4.0 | 15 700 | 491 900 | 4 300 | 200 500 | 4 300 | 208 100
Resume | 3 800 | 124 100 | 460 | 13 900 | 480 | 15 100
Weibo | 1 400 | 73 500 | 270 | 14 500 | 270 | 14 800