Survey on entity extraction for low-resource scenarios

doi:10.11959/j.issn.2096-0271.2023079

Abstract

Entity extraction is a core task in information extraction. In recent years, driven by the trend of training models on large-scale data, deep learning has achieved success on entity extraction. However, in domains such as the natural environment, entity samples or labeled samples for types such as terrain and disasters are scarce, and labeling unlabeled samples is time-consuming and laborious. Entity extraction for low-resource scenarios, also called low-resource entity extraction or few-shot entity extraction, has therefore attracted growing attention. This paper systematically reviews current work on low-resource entity extraction. It first introduces the research status of three types of methods: meta-learning based, multi-task learning based, and prompt learning based. It then summarizes the commonly used low-resource entity extraction datasets and the experimental results of representative models on these datasets, and analyzes the current approaches. Finally, it summarizes the challenges of low-resource entity extraction and discusses future research directions.

Keywords: entity extraction, low-resource scenarios, few-shot learning

CLC number: TP18

Daozhu XU, Kailin ZHAO, Dong KANG, Chao MA, Yuming FENG, Zixuan LI, Burong YI, Xiaolong JIN. Survey on entity extraction for low-resource scenarios[J]. Big Data Research, 2024, 10(1): 46-61.

References (34)

[1]	HOSPEDALES T, ANTONIOU A, MICAELLI P, et al. Meta-learning in neural networks: a survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 44(9): 5149-5169.
[2]	FRITZLER A, LOGACHEVA V, KRETOV M. Few-shot classification in named entity recognition task[C]// Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing. New York: ACM, 2019: 993-1000.
[3]	SNELL J, SWERSKY K, ZEMEL R. Prototypical networks for few-shot learning[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. New York: ACM, 2017: 4080-4090.
[4]	YANG Y, KATIYAR A. Simple and effective few-shot named entity recognition with structured nearest neighbor learning[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg: Association for Computational Linguistics, 2020: 6365-6375.
[5]	HOU Y T, CHE W X, LAI Y K, et al. Few-shot slot tagging with collapsed dependency transfer and label-enhanced task-adaptive projection network[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2020: 1381-1393.
[6]	LAFFERTY J, MCCALLUM A, PEREIRA F C N. Conditional random fields: probabilistic models for segmenting and labeling sequence data[C]// Proceedings of the 18th International Conference on Machine Learning. New York: ACM, 2001: 282-289.
[7]	DAS S S S, KATIYAR A, PASSONNEAU R, et al. CONTaiNER: few-shot named entity recognition via contrastive learning[C]// Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: Association for Computational Linguistics, 2022: 6338-6353.
[8]	WANG J N, WANG C Y, TAN C Q, et al. SpanProto: a two-stage span-based prototypical network for few-shot named entity recognition[C]// Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: Association for Computational Linguistics, 2022: 3466-3476.
[9]	LI J, CHIU B, FENG S S, et al. Few-shot named entity recognition via meta-learning[J]. IEEE Transactions on Knowledge and Data Engineering, 2022, 34(9): 4245-4256.
[10]	LI J, SHANG S, SHAO L. MetaNER: named entity recognition with meta-learning[C]// Proceedings of The Web Conference 2020. New York: ACM, 2020: 429-440.
[11]	FINN C, ABBEEL P, LEVINE S. Model-agnostic meta-learning for fast adaptation of deep networks[C]// Proceedings of the 34th International Conference on Machine Learning. New York: ACM, 2017: 1126-1135.
[12]	ZHANG T, XIA C Y, LU C T, et al. MZET: memory augmented zero-shot fine-grained named entity typing[C]// Proceedings of the 28th International Conference on Computational Linguistics. Stroudsburg: International Committee on Computational Linguistics, 2020: 77-87.
[13]	JI B, LI S, GAN S, et al. Few-shot named entity recognition with entity-level prototypical network enhanced by dispersedly distributed prototypes[EB]. arXiv preprint, 2022, arXiv:2208.08023.
[14]	WEN W, LIU Y B, LIN Q, et al. Few-shot named entity recognition with joint token and sentence awareness[J]. Data Intelligence, 2023, 5(3): 767-785.
[15]	WANG P Y, XU R X, LIU T Y, et al. An enhanced span-based decomposition method for few-shot sequence labeling[C]// Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: Association for Computational Linguistics, 2022: 5012-5024.
[16]	MA T T, JIANG H Q, WU Q H, et al. Decomposed meta-learning for few-shot named entity recognition[C]// Proceedings of Findings of the Association for Computational Linguistics: ACL 2022. Stroudsburg: Association for Computational Linguistics, 2022: 1584-1596.
[17]	HUANG H N, FENG Y M, JIN X L, et al. DFS-NER: description enhanced few-shot NER via prompt learning and meta-learning[C]// Proceedings of 2022 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT). Piscataway: IEEE Press, 2023: 796-803.
[18]	BAPNA A, TÜR G, HAKKANI-TÜR D, et al. Towards zero-shot frame semantic parsing for domain scaling[C]// Proceedings of Interspeech 2017. ISCA: ISCA, 2017: 2476-2480.
[19]	LEE S, JHA R. Zero-shot adaptive transfer for conversational language understanding[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2019, 33(1): 6642-6649.
[20]	SHAH D, GUPTA R, FAYAZI A, et al. Robust zero-shot cross-domain slot filling with example values[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2019: 5484-5490.
[21]	LIU Z H, WINATA G I, FUNG P. Zero-resource cross-domain named entity recognition[C]// Proceedings of the 5th Workshop on Representation Learning for NLP. Stroudsburg: Association for Computational Linguistics, 2020: 1-6.
[22]	ZHANG Y, YANG Q. A survey on multi-task learning[J]. IEEE Transactions on Knowledge and Data Engineering, 2022, 34(12): 5586-5609.
[23]	WEI J, BOSMA M, ZHAO V Y, et al. Finetuned language models are zero-shot learners[EB]. arXiv preprint, 2021, arXiv:2109.01652.
[24]	ZHANG N, LI L, CHEN X, et al. Differentiable prompt makes pre-trained language models better few-shot learners[EB]. arXiv preprint, 2021, arXiv:2108.13161.
[25]	CUI L Y, WU Y, LIU J, et al. Template-based named entity recognition using BART[C]// Proceedings of Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. Stroudsburg: Association for Computational Linguistics, 2021: 1835-1845.
[26]	MA R T, ZHOU X, GUI T, et al. Template-free prompt tuning for few-shot NER[C]// Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: Association for Computational Linguistics, 2022: 5721-5732.
[27]	HUANG Y, HE K, WANG Y, et al. COPNER: contrastive learning with prompt guiding for few-shot named entity recognition[C]// Proceedings of the 29th International Conference on Computational Linguistics. Gyeongju: International Committee on Computational Linguistics, 2022: 2515-2527.
[28]	SUN Y, ZHENG Y, HAO C, et al. NSP-BERT: a prompt-based zero-shot learner through an original pre-training task: next sentence prediction[EB]. arXiv preprint, 2021, arXiv:2109.03564.
[29]	LI D, HU B, CHEN Q, et al. Prompt-based text entailment for low-resource named entity recognition[EB]. arXiv preprint, 2022, arXiv:2211.03039.
[30]	LIU A T, XIAO W, ZHU H, et al. QaNER: prompting question answering models for few-shot named entity recognition[EB]. arXiv preprint, 2022, arXiv:2203.01543.
[31]	COUCKE A, SAADE A, BALL A, et al. Snips voice platform: an embedded spoken language understanding system for private-by-design voice interfaces[EB]. arXiv preprint, 2018, arXiv:1805.10190.
[32]	LIU Z H, XU Y, YU T Z, et al. CrossNER: evaluating cross-domain named entity recognition[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(15): 13452-13460.
[33]	DING N, XU G W, CHEN Y L, et al. Few-NERD: a few-shot named entity recognition dataset[C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Stroudsburg: Association for Computational Linguistics, 2021: 3198-3213.
[34]	VAN DIS E A M, BOLLEN J, ZUIDEMA W, et al. ChatGPT: five priorities for research[J]. Nature, 2023, 614(7947): 224-226.

Setting	Model	We	Mu	Pl	Bo	Se	Re	Cr	Average F1
1-shot	L-TapNet+CDT [5]	71.53%	60.56%	66.27%	84.54%	76.27%	70.79%	62.89%	70.41%
1-shot	ESD [15]	78.25%	54.74%	71.15%	71.45%	67.85%	71.52%	78.14%	70.44%
1-shot	DFS-NER [17]	77.61%	65.21%	72.39%	89.31%	78.11%	72.65%	67.50%	74.68%
5-shot	L-TapNet+CDT [5]	71.64%	67.16%	75.88%	84.38%	82.58%	70.05%	73.41%	75.01%
5-shot	ESD [15]	84.50%	66.61%	79.69%	82.57%	82.22%	80.44%	81.13%	79.59%
5-shot	DFS-NER [17]	80.42%	76.81%	84.52%	90.02%	86.79%	78.32%	84.81%	83.10%
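
The average F1 column in the table above is consistent with an unweighted arithmetic mean over the seven domain columns. The source does not state the averaging method, so the following is a minimal sketch under that assumption; the function name `average_f1` is hypothetical, and the scores are copied from the 1-shot L-TapNet+CDT row.

```python
from statistics import mean

# Per-domain F1 scores (%) for L-TapNet+CDT in the 1-shot setting,
# copied from the seven domain columns of the table above.
per_domain_f1 = [71.53, 60.56, 66.27, 84.54, 76.27, 70.79, 62.89]


def average_f1(scores):
    """Unweighted mean over domains, rounded to two decimals.

    Assumption: every domain counts equally (no weighting by test-set size).
    """
    return round(mean(scores), 2)


print(average_f1(per_domain_f1))  # 70.41, matching the reported average
```

Because each domain is weighted equally, a model that is strong on one domain and weak on another (such as ESD on Cr versus Mu in the 1-shot setting) can end up with nearly the same average as a more uniform model.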

Setting	Model	CoNLL-03	GUM	WNUT-17	OntoNotes 5.0	Average F1
1-shot	L-TapNet+CDT [5]	44.30%	12.04%	20.80%	15.17%	23.08%
1-shot	DecomMetaNER [16]	46.09%	17.54%	25.14%	34.13%	30.73%
1-shot	SpanProto [8]	47.70%	19.92%	28.31%	36.41%	33.09%
5-shot	L-TapNet+CDT [5]	45.35%	11.65%	23.30%	20.95%	25.31%
5-shot	DecomMetaNER [16]	58.18%	31.36%	31.02%	45.55%	41.53%
5-shot	SpanProto [8]	61.88%	35.12%	33.94%	48.21%	44.79%

Model	INTRA (5-way)	INTRA (10-way)	INTER (5-way)	INTER (10-way)
Proto [2]	20.76%	15.05%	38.83%	32.45%
NNShot [4]	25.78%	18.27%	47.24%	38.87%
StructShot [4]	30.21%	21.03%	51.88%	43.34%
L-TapNet+CDT [5]	25.81%	18.02%	41.44%	36.80%
DFS-NER [17]	35.41%	20.31%	48.03%	34.38%
CONTaiNER [7]	40.43%	33.84%	55.95%	48.35%
ESD [15]	36.08%	30.00%	59.29%	52.16%
DecomMetaNER [16]	49.48%	42.84%	64.75%	58.65%
SpanProto [8]	54.49%	45.39%	73.36%	66.26%

Model	INTRA (3-way)	INTRA (5-way)	INTER (3-way)	INTER (5-way)
Proto [2]	20.50%	9.91%	50.55%	48.23%
NNShot [4]	17.99%	7.26%	67.97%	65.44%
StructShot [4]	19.46%	16.25%	67.94%	63.90%
DecomMetaNER [16]	21.31%	17.90%	75.32%	66.67%
ESD [15]	25.48%	20.65%	74.21%	69.04%

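Method	Advantages	Disadvantages
Meta-learning based methods	Less prone to overfitting on the target domain	Unstable performance; heavily dependent on labeled samples from the target domain
Multi-task learning based methods	Strong generalization and representation ability	Dependent on the choice and quality of auxiliary tasks; difficult to balance the influence of different auxiliary tasks on the main task
Prompt learning based methods	Low dependence on labeled data	Limited supervision signal; heavily dependent on the choice of prompts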