Chinese Journal of Intelligent Science and Technology, 2020, Vol. 2, Issue (2): 144-152. doi: 10.11959/j.issn.2096-6652.202016
Revised: 2020-04-01
Online: 2020-06-20
Published: 2020-07-14
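About the authors:
Xiaomian KANG (1991- ), male, Ph.D. candidate at the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences. His main research interests are machine translation and discourse analysis. | Chengqing ZONG (1963- ), male, Ph.D., research professor and doctoral supervisor at the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences. His main research interests include machine translation, natural language processing, and text data mining.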
Xiaomian KANG, Chengqing ZONG
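Abstract:
Most existing document-level neural machine translation methods use only the lexical information of the document context when translating a sentence, and ignore the structural relations among cross-sentence discourse semantic units. To address this problem, several discourse structural position encoding strategies are proposed. They use discourse trees based on Rhetorical Structure Theory to represent the positional relations between words located in different discourse units of the tree. Experiments show that position encoding effectively integrates source-side discourse structure information into a Transformer-based neural machine translation model and significantly improves translation quality.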
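The abstract does not define the individual encodings, so the following is only a minimal sketch of what pairwise positions on a discourse tree could look like. The three quantities (relative EDU distance, relative depth, and tree path length) are assumptions loosely modeled on the row labels of Table 3, and `Node`, `collect_leaves`, `pairwise_positions`, and `clip` are hypothetical names, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical binary discourse tree node; a leaf holds the words of one EDU."""
    words: Optional[List[str]] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def collect_leaves(root: Node) -> List[Tuple[List[str], Tuple[int, ...]]]:
    """Return (words, root-to-leaf path) for every leaf EDU, left to right."""
    leaves = []

    def walk(node: Node, path: Tuple[int, ...]) -> None:
        if node.words is not None:
            leaves.append((node.words, path))
            return
        walk(node.left, path + (0,))
        walk(node.right, path + (1,))

    walk(root, ())
    return leaves


def pairwise_positions(root: Node, clip: int = 4):
    """Compute three assumed word-to-word position matrices over the tree."""
    leaves = collect_leaves(root)
    # EDU index of each word, in text order.
    word_edu = [k for k, (words, _path) in enumerate(leaves) for _word in words]
    paths = [path for _words, path in leaves]
    n = len(word_edu)
    rel_edu = [[0] * n for _ in range(n)]
    rel_depth = [[0] * n for _ in range(n)]
    path_len = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            a, b = word_edu[i], word_edu[j]
            pa, pb = paths[a], paths[b]
            # Length of the shared prefix = depth of the lowest common ancestor.
            common = 0
            for x, y in zip(pa, pb):
                if x != y:
                    break
                common += 1
            rel_edu[i][j] = max(-clip, min(clip, b - a))
            rel_depth[i][j] = max(-clip, min(clip, len(pb) - len(pa)))
            path_len[i][j] = (len(pa) - common) + (len(pb) - common)
    return rel_edu, rel_depth, path_len


# Toy tree ((EDU0, EDU1), EDU2); the sentence is made up for illustration.
tree = Node(
    left=Node(
        left=Node(words=["He", "left"]),
        right=Node(words=["because", "it", "rained"]),
    ),
    right=Node(words=["so", "we", "stayed"]),
)
rel_edu, rel_depth, path_len = pairwise_positions(tree)
print(rel_edu[0][5], rel_depth[0][5], path_len[0][5])
```

In a Transformer, integer matrices like these would typically index learned embedding tables whose vectors are combined with the attention inputs. How the paper actually fuses them is not specified on this page beyond the mention of nonlinear fusion in the caption of Table 3.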
Xiaomian KANG, Chengqing ZONG. Fusion of discourse structural position encoding for neural machine translation[J]. Chinese Journal of Intelligent Science and Technology, 2020, 2(2): 144-152.
Table 3
BLEU scores after adding discourse structural position encodings to the HierAtt model with nonlinear fusion
Model | En-Zh TED | En-De TED | En-De Europarl |
Base | 21.54% | 28.44% | 28.87% |
HierAtt | 22.29% | 29.31% | 29.76% |
+Rel EDU-PE | 22.41% | 29.52% | 29.95% |
+Rel Depth-PE | 22.82%* | 29.61% | 30.12%* |
+Path-PE | 22.78%* | 29.83%* | 30.17%* |
+Rel EDU-PE+Rel Depth-PE | 23.07%* | 29.70% | 30.21%* |
+Rel EDU-PE+Rel Depth-PE+Path-PE | 22.98%* | 29.97%* | 30.28%* |