Chinese Journal of Network and Information Security ›› 2023, Vol. 9 ›› Issue (5): 138-149. DOI: 10.11959/j.issn.2096-109x.2023078

• Academic Papers •

Research on the robustness of neural machine translation systems in word order perturbation

Yuran ZHAO, Tang XUE, Gongshen LIU

  1. School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  • Revised: 2023-03-02 Online: 2023-10-01 Published: 2023-10-01
  • About the authors: Yuran ZHAO (1998−), male, from Anyang, Henan, is a master's student at Shanghai Jiao Tong University; his main research interest is natural language processing
    Tang XUE (1999−), male, from Yuncheng, Shanxi, is a master's student at Shanghai Jiao Tong University; his main research interest is natural language processing
    Gongshen LIU (1974−), male, from Liaocheng, Shandong, is a professor and doctoral supervisor at Shanghai Jiao Tong University; his main research interests include artificial intelligence security, natural language processing, and information security
  • Supported by:
    The National Natural Science Foundation of China (U21B2020); the Shanghai Science and Technology Plan Project (22511104400)

Abstract:

Pre-trained language models are among the most important models in natural language processing, and the pre-train-then-fine-tune paradigm has become standard for many downstream tasks. Previous studies have shown that integrating pre-trained language models such as BERT into neural machine translation (NMT) models improves translation performance. However, it remains unclear whether these gains come from stronger semantic or stronger syntactic modeling ability, and whether, and how, pre-trained knowledge affects the robustness of NMT models. To answer these questions, probing tasks were used to test the syntactic modeling ability of the encoders of the two kinds of NMT models, and the results show that translation models enhanced with pre-trained models capture sentence word order better. On this basis, an attack method based on word order perturbation was proposed to evaluate the robustness of NMT models. Experiments on multiple language pairs show that, even under word order perturbation attacks, BERT-enhanced NMT models generally outperform vanilla NMT models, demonstrating that pre-trained models can improve the robustness of translation models. In the English-German translation task, however, the BERT-enhanced model produced worse translations, indicating that English BERT harms model robustness in this scenario. Further analysis shows that the model enhanced with English BERT fails to bridge the semantic gap between the original and the perturbed source sentences, leading to more erroneous copying and more errors in translating low-frequency words. These findings suggest that pre-training does not always bring improvements to downstream tasks, and researchers should decide whether to use pre-trained models according to the characteristics of the task.
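The abstract only names the word-order perturbation attack; the concrete procedure is described in the body of the paper. As an illustration of the general idea, the minimal Python sketch below perturbs a source sentence by randomly swapping adjacent tokens before it is fed to an NMT model. The function name and the swap_ratio parameter are assumptions made here for illustration, not the authors' implementation.

    import random

    def perturb_word_order(tokens, swap_ratio=0.15, seed=None):
        """Randomly swap adjacent tokens to perturb the word order of a sentence.

        Illustrative sketch only: swap_ratio (fraction of positions swapped)
        is an assumed parameter, not taken from the paper.
        """
        rng = random.Random(seed)
        tokens = list(tokens)
        if len(tokens) < 2:
            return tokens
        n_swaps = max(1, int(len(tokens) * swap_ratio))
        for _ in range(n_swaps):
            i = rng.randrange(len(tokens) - 1)                    # pick a position
            tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]   # swap with right neighbour
        return tokens

    # Example: perturb an English source sentence, then compare how the vanilla
    # and the BERT-enhanced NMT models translate the perturbed input.
    src = "the quick brown fox jumps over the lazy dog".split()
    print(" ".join(perturb_word_order(src, swap_ratio=0.2, seed=0)))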

Keywords: neural machine translation, pre-trained model, robustness, word order

