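Unsupervised word sense tagging method based on transformation rules
Cite this article: LI Juanzi, HUANG Changning. Unsupervised word sense tagging method based on transformation rules[J]. Journal of Tsinghua University (Science and Technology), 1999, 39(7): 301
Authors: LI Juanzi, HUANG Changning
Affiliation: Department of Computer Science and Technology, Tsinghua University, Beijing 100084
Abstract: Word sense tagging is one of the hard problems in natural language processing. This paper proposes an algorithm that automatically acquires transformation rules for tagging word senses in text, together with a corresponding word sense disambiguation algorithm. The learning algorithm restricts the context to possible syntactic relations, which reduces noise in the training data. To speed up learning, it uses pre-sorting to reduce the number of rule searches and recomputes only the part of the data that has changed. A word sense disambiguation algorithm that improves recall is also given. The algorithm was evaluated on a corpus of nearly 50,000 words; the word sense disambiguation accuracy in the open test was 74.3%.

Keywords: natural language processing; word sense tagging; unsupervised learning
Revised manuscript received: 1998-07-09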

Unsupervised word sense tagging method based on transformation rules
LI Juanzi, HUANG Changning. Unsupervised word sense tagging method based on transformation rules[J]. Journal of Tsinghua University (Science and Technology), 1999, 39(7): 301
Authors: LI Juanzi, HUANG Changning
Abstract: Word sense tagging is one of the most difficult problems in natural language processing. This paper puts forward an algorithm that automatically learns transformation rules for sense tagging and presents a corresponding word sense disambiguation algorithm. By confining the context to possible syntactic relations, the learning algorithm greatly decreases the noise in the training data. To increase learning speed, the algorithm uses a pre-ordering method to reduce the number of rule searches and recalculates only the affected data. The proposed word sense disambiguation algorithm can greatly increase recall. Finally, an experiment is performed on a corpus of about 50,000 words, and the precision in the open test is 74.3%.
Keywords: natural language processing; word sense tagging; unsupervised learning
This article is indexed in CNKI, Wanfang Data, and other databases.