
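A direct greedy decoding algorithm for the tree-to-string syntax-based statistical translation model
Cite this article: Xue Yongzeng, Li Sheng, Zhao Tiejun, Yang Muyun. Greedy direct decoding algorithm for syntax-based tree-to-string statistical translation model [J]. Journal of Southeast University (Natural Science Edition), 2007, 37(5): 803-807.
Authors: Xue Yongzeng  Li Sheng  Zhao Tiejun  Yang Muyun
Affiliation: MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin 150001, China
Funding: National High Technology Research and Development Program of China (863 Program)
Abstract: To use syntactic information effectively in guiding translation, a direct decoding algorithm based on greedy search is proposed for the tree-to-string syntax-based statistical translation model. The algorithm uses a log-linear model as its overall framework, with the translation model probability, the language model probability, and a null translation penalty as feature functions. Decoding first generates an initial translation and then improves it by repeatedly traversing the parse tree. The study focuses on how translation segments are scored during decoding. Experiments were run on the IWSLT 2004 data set, and the translations were evaluated with BLEU. The results show that the direct greedy decoding algorithm outperforms the existing reverse decoding algorithm in both translation quality and speed, which indicates that it uses syntactic structure information more effectively and is better suited to the tree-to-string statistical translation model.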
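The log-linear framework named in the abstract is conventionally written as a weighted sum of feature functions. The form below is a sketch under assumptions, not the paper's own notation: the symbols $f$ (source sentence), $e$ (candidate translation), $\lambda_m$ (feature weights), and $N_{\mathrm{null}}$ (count of source words left untranslated) are introduced here for illustration, and the abstract does not state the direction of the translation model probability or the exact form of the null translation penalty.

```latex
\hat{e} = \arg\max_{e} \sum_{m=1}^{3} \lambda_m \, h_m(e, f),
\qquad
\begin{aligned}
h_1(e, f) &= \log P_{\mathrm{TM}}(e, f) && \text{(translation model)} \\
h_2(e, f) &= \log P_{\mathrm{LM}}(e)    && \text{(language model)} \\
h_3(e, f) &= -N_{\mathrm{null}}(e, f)   && \text{(null translation penalty, assumed form)}
\end{aligned}
```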

Keywords: statistical machine translation  syntax  greedy  decoding
Article ID: 1001-0505(2007)05-0803-05
Revised: 2007-03-05

Greedy direct decoding algorithm for syntax-based tree-to-string statistical translation model
Xue Yongzeng, Li Sheng, Zhao Tiejun, Yang Muyun. Greedy direct decoding algorithm for syntax-based tree-to-string statistical translation model [J]. Journal of Southeast University (Natural Science Edition), 2007, 37(5): 803-807.
Authors:Xue Yongzeng  Li Sheng  Zhao Tiejun  Yang Muyun
Institution:MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin 150001, China
Abstract: To guide the translation process effectively with syntactic information, a greedy direct decoding algorithm is proposed for the syntax-based tree-to-string statistical translation model. The log-linear model is adopted as the framework, and the feature functions are defined on the translation model probability, the language model probability, and the null translation penalty. The decoder first generates an initial translation gloss and then improves the gloss by iteratively traversing the parse tree. The scoring methods for translation segments are described. The experiment was carried out on the IWSLT 2004 data set, and the translation results were evaluated with the BLEU metric. Experimental results show that the greedy direct decoding algorithm outperforms the current reverse decoding algorithm in both translation quality and speed. This indicates that the greedy direct decoding algorithm makes more effective use of syntactic information and is thus more suitable for the tree-to-string statistical translation model.
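The decoding loop described in the abstract (build an initial gloss, then revisit the parse tree until nothing improves) can be illustrated with a small hill-climbing sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `Node` class, the toy `TM` and `BIGRAM` tables with invented probabilities, the unit `WEIGHTS`, and the two edit moves (swapping adjacent children, replacing a leaf's translation). The paper's actual operations and segment scoring are not given in the abstract.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str = ""                                  # source word (leaves only)
    children: list = field(default_factory=list)
    choice: str = ""                                # chosen target word; "" = null translation

# Toy models; all probabilities are invented for illustration.
TM = {"我": {"i": 1.0},
      "在": {"in": 0.6, "at": 0.4},
      "家": {"home": 0.7, "house": 0.3},
      "吃饭": {"eat": 0.9, "": 0.1}}
BIGRAM = {("<s>", "i"): 0.5, ("i", "eat"): 0.4, ("eat", "at"): 0.3,
          ("at", "home"): 0.5, ("home", "</s>"): 0.4}
FLOOR = 0.01                                        # probability of unseen bigrams
WEIGHTS = {"tm": 1.0, "lm": 1.0, "null": 1.0}       # hypothetical feature weights

def leaves(node):
    """Leaves in left-to-right order, which is also the target word order."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]

def nodes(node):
    """All nodes in pre-order."""
    yield node
    for child in node.children:
        yield from nodes(child)

def score(root):
    """Log-linear score: weighted TM, LM, and null-penalty features."""
    lv = leaves(root)
    seq = ["<s>"] + [l.choice for l in lv if l.choice] + ["</s>"]
    tm = sum(math.log(TM[l.word][l.choice]) for l in lv)
    lm = sum(math.log(BIGRAM.get(pair, FLOOR)) for pair in zip(seq, seq[1:]))
    null = -sum(1 for l in lv if not l.choice)
    return WEIGHTS["tm"] * tm + WEIGHTS["lm"] * lm + WEIGHTS["null"] * null

def greedy_decode(root):
    # Initial gloss: most probable translation per source word, source order.
    for leaf in leaves(root):
        leaf.choice = max(TM[leaf.word], key=TM[leaf.word].get)
    best, improved = score(root), True
    while improved:                                 # repeat until a full pass changes nothing
        improved = False
        for node in list(nodes(root)):
            if node.children:                       # internal node: try adjacent swaps
                kids = node.children
                for i in range(len(kids) - 1):
                    kids[i], kids[i + 1] = kids[i + 1], kids[i]
                    new = score(root)
                    if new > best + 1e-12:
                        best, improved = new, True
                    else:                           # no gain: undo the swap
                        kids[i], kids[i + 1] = kids[i + 1], kids[i]
            else:                                   # leaf: try other translations
                for cand in TM[node.word]:
                    old = node.choice
                    if cand == old:
                        continue
                    node.choice = cand
                    new = score(root)
                    if new > best + 1e-12:
                        best, improved = new, True
                    else:                           # no gain: restore old choice
                        node.choice = old
    return " ".join(l.choice for l in leaves(root) if l.choice), best

def example_tree():
    # S( 我, VP( PP( 在, 家 ), 吃饭 ) )
    return Node(children=[Node("我"),
                          Node(children=[Node(children=[Node("在"), Node("家")]),
                                         Node("吃饭")])])

print(greedy_decode(example_tree())[0])
```

With these toy tables the initial gloss follows source order ("i in home eat"); the loop then swaps the children of the VP node and replaces "in" with "at" because both moves raise the combined score. Accepting the first improving move keeps each pass cheap, at the cost of possibly stopping in a local optimum, which is the usual trade-off of greedy search.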
Keywords:statistical machine translation  syntax  greedy  decoding
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.