Training Machine Translation Quality Estimation Model Based on Pseudo Data

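Cite this article:	WU Huanqin, ZHANG Hongyang, LI Jingmei, ZHU Junguo, YANG Muyun, LI Sheng. Training Machine Translation Quality Estimation Model Based on Pseudo Data[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2018, 54(2): 279-285.

Authors:	WU Huanqin, ZHANG Hongyang, LI Jingmei, ZHU Junguo, YANG Muyun, LI Sheng

Affiliations:	School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001; School of Computer Science and Technology, Harbin Engineering University, Harbin 150001

Funding:	National High Technology Research and Development Program of China; National Natural Science Foundation of China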

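Abstract (translated from the Chinese):	To provide efficient training data for deep-learning-based machine translation quality estimation models, a pseudo-data construction method oriented to the target dataset is proposed. The model is trained with a two-stage method that combines pre-training on pseudo data with fine-tuning, and experiments are designed for different pseudo-data scales. The results show that, with the constructed pseudo data, the scores produced by a quality estimation model trained in these two stages correlate significantly better with human scores.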
Keywords:	machine translation; quality estimation; deep learning; pseudo data
Received:	2017-06-05
Training Machine Translation Quality Estimation Model Based on Pseudo Data

WU Huanqin, ZHANG Hongyang, LI Jingmei, ZHU Junguo, YANG Muyun, LI Sheng. Training Machine Translation Quality Estimation Model Based on Pseudo Data[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2018, 54(2): 279-285.

Authors:	WU Huanqin, ZHANG Hongyang, LI Jingmei, ZHU Junguo, YANG Muyun, LI Sheng

Institutions:	1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001; 2. School of Computer Science and Technology, Harbin Engineering University, Harbin 150001

Abstract:	To provide efficient training data for neural machine translation quality estimation models, a pseudo-data construction method targeted at the evaluation dataset is proposed. The model is trained in two stages: pre-training on the pseudo data, followed by fine-tuning. Experiments are run at different pseudo-data scales. The results show that, with the constructed pseudo data, the scores given by the quality estimation model trained in two stages correlate significantly better with human scores.

Keywords:	machine translation; quality estimation; deep learning; pseudo data
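
The two-stage procedure summarized in the abstract (pre-train on a large pseudo dataset, then fine-tune on the smaller target dataset, and judge the result by correlation with human scores) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the paper uses a deep learning model, whereas this sketch substitutes a plain linear scorer trained by stochastic gradient descent, and the synthetic feature vectors, weights, noise levels, learning rates, and function names are hypothetical rather than taken from the paper.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def predict(weights, features):
    """Score of the (hypothetical) linear quality estimator."""
    return sum(w * f for w, f in zip(weights, features))


def sgd_train(weights, data, lr, epochs):
    """Continue training from `weights` by per-example gradient steps on squared error."""
    w = list(weights)
    for _ in range(epochs):
        for features, score in data:
            err = predict(w, features) - score
            w = [wi - lr * err * fi for wi, fi in zip(w, features)]
    return w


def make_data(n, true_w, noise, rng):
    """Synthetic (features, score) pairs; stands in for real QE data (assumption)."""
    data = []
    for _ in range(n):
        feats = [rng.uniform(-1.0, 1.0) for _ in true_w]
        data.append((feats, predict(true_w, feats) + rng.gauss(0.0, noise)))
    return data


rng = random.Random(0)
# Pseudo data: large, noisy, and only approximately matching the target scoring.
pseudo = make_data(2000, [1.0, -0.5, 0.3], 0.3, rng)
# Target data: small set with human-style scores, plus a held-out test set.
target_train = make_data(30, [1.2, -0.4, 0.5], 0.05, rng)
target_test = make_data(200, [1.2, -0.4, 0.5], 0.05, rng)

# Stage 1: pre-train on pseudo data. Stage 2: fine-tune on target data.
w_pre = sgd_train([0.0, 0.0, 0.0], pseudo, lr=0.01, epochs=3)
w_fine = sgd_train(w_pre, target_train, lr=0.01, epochs=20)

human = [score for _, score in target_test]
r_pre = pearson([predict(w_pre, f) for f, _ in target_test], human)
r_fine = pearson([predict(w_fine, f) for f, _ in target_test], human)
print(f"pre-trained only: r = {r_pre:.3f}; after fine-tuning: r = {r_fine:.3f}")
```

The point of the sketch is the control flow only: the fine-tuning stage starts from the pre-trained weights instead of from scratch, and the evaluation metric is the correlation between model scores and reference scores on held-out target data.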
This article is indexed in CNKI, Wanfang Data, and other databases.