Fusion of discourse structural position encoding for neural machine translation

doi:10.11959/j.issn.2096-6652.202016

Abstract

Most existing document-level neural machine translation (DocNMT) methods focus on exploiting the lexical information of the context and ignore the structural relationships among cross-sentence discourse semantic units. Therefore, multiple discourse structural position encoding strategies are proposed to represent the positional relationships among the words in discourse units over the discourse tree, based on rhetorical structure theory (RST). Experimental results show that position encoding effectively fuses source-side discourse structural position information into DocNMT models built on the Transformer architecture, and that translation quality improves significantly.
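
As a rough illustration of the idea of encoding a word's position at the level of discourse units, the sketch below assigns every token the index of the elementary discourse unit (EDU) that contains it and adds a sinusoidal encoding of that index to the ordinary word-position encoding. The sinusoidal form, the additive combination, the function names, and the example segmentation are all assumptions made for illustration; the abstract does not specify the exact formulas used in the paper.

```python
import math


def sinusoid(position, d_model):
    """Transformer-style sinusoidal encoding of one integer position."""
    vec = []
    for i in range(d_model):
        angle = position / (10000 ** (2 * (i // 2) / d_model))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec


def edu_positions(edus):
    """Map every token to the index of the EDU that contains it."""
    return [edu_idx for edu_idx, edu in enumerate(edus) for _ in edu]


def fuse_additive(edus, d_model=8):
    """Add the word-position and EDU-position encodings for each token."""
    fused = []
    for word_pos, edu_idx in enumerate(edu_positions(edus)):
        word_pe = sinusoid(word_pos, d_model)
        edu_pe = sinusoid(edu_idx, d_model)
        fused.append([w + e for w, e in zip(word_pe, edu_pe)])
    return fused


# Hypothetical segmentation of one sentence into two EDUs.
edus = [["Although", "it", "rained", ","], ["we", "went", "out", "."]]
positions = edu_positions(edus)
encodings = fuse_additive(edus)
```

Here `positions` holds one EDU index per token, so words in the same discourse unit share the same structural position even though their word positions differ. A depth-based or path-based variant would replace `edu_positions` with a function that reads the corresponding value from a parsed RST tree.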

Key words: neural machine translation, discourse structure, position encoding, discourse analysis, rhetorical structure theory

CLC Number: TP391

Xiaomian KANG, Chengqing ZONG. Fusion of discourse structural position encoding for neural machine translation[J]. Chinese Journal of Intelligent Science and Technology, 2020, 2(2): 144-152.

Dataset	Number of sentences	Number of paragraphs
En-Zh TED	0.23 M/0.9 K/3.9 K	15.3 K/59/261
En-De TED	0.21 M/0.9 K/3.4 K	13.7 K/60/230
En-De Europarl	1.67 M/3.6 K/5.1 K	156.5 K/330/477

Fusion method	No.	Model	BLEU
None	1	Base	28.63%
	2	HierAtt	29.38%
Additive fusion	3	+Abs EDU-PE	29.52%
	4	+Rel EDU-PE	29.49%
	5	+Abs Depth-PE	29.68%
	6	+Rel Depth-PE	29.65%
	7	+Path-PE	29.77%
Nonlinear fusion	8	+Abs EDU-PE	29.48%
	9	+Rel EDU-PE	29.53%
	10	+Abs Depth-PE	29.76%
	11	+Rel Depth-PE	29.80%
	12	+Path-PE	29.89%

Model	En-Zh TED	En-De TED	En-De Europarl
Base	21.54%	28.44%	28.87%
HierAtt	22.29%	29.31%	29.76%
+Rel EDU-PE	22.41%	29.52%	29.95%
+Rel Depth-PE	22.82%*	29.61%	30.12%*
+Path-PE	22.78%*	29.83%*	30.17%*
+Rel EDU-PE+Rel Depth-PE	23.07%*	29.70%	30.21%*
+Rel EDU-PE+Rel Depth-PE+Path-PE	22.98%*	29.97%*	30.28%*

Setting	FlatAtt	HierAtt
Without discourse structural position encoding	22.06%	22.29%
With discourse structural position encoding	23.05%	22.98%
