
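Identifying English BaseNPs by Combining the Maximum Entropy and Brill Methods
Cite this article: LÜ Lin, LIU Yu-shu. Identifying English BaseNPs by combining the maximum entropy and Brill methods[J]. Journal of Beijing Institute of Technology, 2006, 26(6): 500-503.
Authors: LÜ Lin, LIU Yu-shu
Affiliation: School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081 (both authors)
Abstract: To further improve the accuracy of base noun phrase (BaseNP) identification, an algorithm for identifying English BaseNPs is proposed that combines the maximum entropy method and the Brill method, exploiting the characteristics of each. The algorithm builds on high-accuracy part-of-speech tagging. In both the training and testing phases, the maximum entropy method is first used to identify BaseNPs, and the resulting high-accuracy output then serves as the initial annotation for the Brill method. Experimental results show that the combined algorithm achieves 94% precision and recall, fully integrates the strengths of the maximum entropy and Brill methods, and is comparable to the best results currently reported for English BaseNP identification on the same training and test corpora.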

Keywords: base noun phrase; phrase identification; maximum entropy; Brill method
Article ID: 1001-0645(2006)06-0500-04
Received: October 27, 2005
Revised: October 27, 2005

Identifying English BaseNPs Through a Combination of Maximum Entropy Approach and Brill Approach
LÜ Lin, LIU Yu-shu. Identifying English BaseNPs Through a Combination of Maximum Entropy Approach and Brill Approach[J]. Journal of Beijing Institute of Technology (Natural Science Edition), 2006, 26(6): 500-503.
Authors: LÜ Lin, LIU Yu-shu
Institution:School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
Abstract: To further improve the accuracy of BaseNP identification by exploiting the respective features of the maximum entropy approach and the Brill approach, an English BaseNP identification algorithm that combines the two is presented. The algorithm builds on a high-accuracy POS (part-of-speech) tagger. In both the training phase and the application phase, the maximum entropy approach is first applied to identify BaseNPs, and its already highly accurate output is then used as the initial annotation for the Brill approach. Experimental results show that the combined algorithm achieves precision and recall of over 94%, fully combining the strengths of the maximum entropy approach and the Brill approach. This is comparable to the best existing results for English BaseNP identification on the same training and testing corpus.
Keywords:BaseNP  phrase identification  maximum entropy  Brill approach
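
The two-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no features, rule templates, or tag scheme, so everything named here is an assumption. In particular, `initial_tags` is a hypothetical POS-based heuristic standing in for the trained maximum entropy tagger, the I/O/B chunk tags (with B marking a BaseNP that starts immediately after another one) are an assumed encoding, and the two candidate rules are invented for illustration. What the sketch does show is the Brill-style part: rules are selected greedily by how many errors they remove from the current annotation, starting from the first stage's output.

```python
from typing import Callable, List, Sequence, Tuple

Token = Tuple[str, str]  # (word, POS tag)
Condition = Callable[[Sequence[Token], Sequence[str], int], bool]
Rule = Tuple[str, str, Condition]  # (from_tag, to_tag, condition)

# Assumption: POS tags the stand-in first stage treats as "inside a BaseNP".
NP_POS = {"DT", "JJ", "NN", "NNS", "NNP", "PRP", "CD"}


def initial_tags(sent: Sequence[Token]) -> List[str]:
    """Hypothetical stand-in for the maximum entropy tagger (stage one)."""
    return ["I" if pos in NP_POS else "O" for _, pos in sent]


def apply_rules(sent: Sequence[Token], tags: Sequence[str],
                rules: Sequence[Rule]) -> List[str]:
    """Apply transformation rules in order, each scanned left to right."""
    out = list(tags)
    for from_tag, to_tag, cond in rules:
        for i in range(len(out)):
            if out[i] == from_tag and cond(sent, out, i):
                out[i] = to_tag
    return out


def errors(tags: Sequence[str], gold: Sequence[str]) -> int:
    """Count positions where the current tags disagree with the gold tags."""
    return sum(t != g for t, g in zip(tags, gold))


def learn_rules(sents: Sequence[Sequence[Token]],
                golds: Sequence[Sequence[str]],
                candidates: Sequence[Rule]) -> List[Rule]:
    """Greedy Brill-style learning on top of the stage-one annotation."""
    current = [initial_tags(s) for s in sents]
    learned: List[Rule] = []
    while True:
        best, best_gain = None, 0
        for rule in candidates:
            gain = sum(errors(c, g) - errors(apply_rules(s, c, [rule]), g)
                       for s, c, g in zip(sents, current, golds))
            if gain > best_gain:
                best, best_gain = rule, gain
        if best is None:  # no candidate reduces the error count any further
            return learned
        learned.append(best)
        current = [apply_rules(s, c, [best]) for s, c in zip(sents, current)]


def det_after_np(sent: Sequence[Token], tags: Sequence[str], i: int) -> bool:
    """A determiner directly after a BaseNP token starts a new BaseNP."""
    return i > 0 and sent[i][1] == "DT" and tags[i - 1] == "I"


def past_verb(sent: Sequence[Token], tags: Sequence[str], i: int) -> bool:
    """Deliberately unhelpful candidate: fires on past-tense verbs."""
    return sent[i][1] == "VBD"


# Hypothetical candidate rules and toy training data (not from the paper).
CANDIDATES: List[Rule] = [("I", "B", det_after_np), ("O", "I", past_verb)]
sentence: List[Token] = [("He", "PRP"), ("gave", "VBD"), ("the", "DT"),
                         ("boy", "NN"), ("a", "DT"), ("book", "NN"), (".", ".")]
gold = ["I", "O", "I", "I", "B", "I", "O"]

rules = learn_rules([sentence], [gold], CANDIDATES)
final = apply_rules(sentence, initial_tags(sentence), rules)
```

On this toy sentence the stand-in first stage merges "the boy" and "a book" into one chunk, and the learner keeps only the rule that splits them while discarding the candidate that would add errors. The more accurate the first stage already is, the fewer corrections the rule learner has to find, which is the motivation the abstract gives for feeding the maximum entropy output into the Brill method.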
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.