Neural Machine Translation Based on XLM-R Cross-lingual Pre-training Language Model
Cite this article: WANG Qian, LI Maoxi, WU Shuixiu, WANG Mingwen. Neural Machine Translation Based on XLM-R Cross-lingual Pre-training Language Model[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2022, 58(1): 29-36.
Authors: WANG Qian  LI Maoxi  WU Shuixiu  WANG Mingwen
Affiliation: School of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022, China
Funding: Supported by the National Natural Science Foundation of China (61662031)
Abstract: The authors explore applying the XLM-R cross-lingual pre-trained language model on the source-language side, the target-language side, and both sides of neural machine translation to improve translation quality. Three network models are proposed, which introduce pre-trained XLM-R multilingual word representations into the Transformer encoder, into the Transformer decoder, and into both, respectively. Experimental results on the WMT English-German, IWSLT English-Portuguese, and English-Vietnamese translation benchmarks show that integrating XLM-R into the Transformer encoder effectively encodes the source sentences and improves system performance on resource-rich translation tasks. On low-resource translation tasks, integrating XLM-R not only encodes the source sentences well but also supplements source-language and target-language knowledge at the same time, thereby improving translation quality.
Keywords: cross-lingual pre-training language model  neural machine translation  Transformer neural network  XLM-R model  fine-tuning
Received: 2021-06-12
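The encoder-side variant described in the abstract can be illustrated with a minimal sketch, assuming PyTorch and the Hugging Face transformers library with the public "xlm-roberta-base" checkpoint; the class name XLMREncoderNMT and the target vocabulary size are hypothetical, and this is not the authors' released code.

# Minimal sketch: XLM-R hidden states replace the Transformer encoder and
# serve as the source-sentence memory that a standard Transformer decoder
# attends to (roughly the encoder-side integration named in the abstract).
import torch
import torch.nn as nn
from transformers import XLMRobertaModel


class XLMREncoderNMT(nn.Module):
    def __init__(self, tgt_vocab_size, d_model=768, nhead=8, num_decoder_layers=6):
        super().__init__()
        # Pre-trained XLM-R stands in for the Transformer encoder.
        self.encoder = XLMRobertaModel.from_pretrained("xlm-roberta-base")
        self.tgt_embed = nn.Embedding(tgt_vocab_size, d_model)
        decoder_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(decoder_layer, num_decoder_layers)
        self.generator = nn.Linear(d_model, tgt_vocab_size)

    def forward(self, src_ids, src_mask, tgt_ids):
        # Source-side representations from XLM-R (frozen or fine-tuned end to end).
        memory = self.encoder(input_ids=src_ids, attention_mask=src_mask).last_hidden_state
        tgt = self.tgt_embed(tgt_ids)
        # Causal mask so each target position attends only to earlier positions.
        tgt_len = tgt_ids.size(1)
        causal_mask = torch.triu(torch.full((tgt_len, tgt_len), float("-inf")), diagonal=1)
        out = self.decoder(tgt, memory, tgt_mask=causal_mask,
                           memory_key_padding_mask=(src_mask == 0))
        return self.generator(out)  # (batch, tgt_len, tgt_vocab_size)

The decoder-side and dual-side variants reported in the paper would analogously feed XLM-R representations into the target side or into both ends; the sketch above covers only the source-side case.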

  