
Rule-Based Method for Unsupervised Part-of-Speech Tagging
Original title (Chinese): 一种基于规则的无监督词性标注方法
Cite this article: PENG Tao, DAI Yaokang, ZHU Fengtong, ZHANG Bangzuo, LIU Lu, YAN Zhao, QIAN Feng. Rule-Based Method for Unsupervised Part-of-Speech Tagging[J]. Journal of Jilin University: Sci Ed, 2015, 53(5): 956-962.
Authors: PENG Tao  DAI Yaokang  ZHU Fengtong  ZHANG Bangzuo  LIU Lu  YAN Zhao  QIAN Feng
Institution: 1. College of Computer Science and Technology, Jilin University, Changchun 130012, China; 2. School of Computer Science and Information Technology, Northeast Normal University, Changchun 130117, China
Abstract: A rule-based method for unsupervised part-of-speech tagging is proposed. More than 200 English grammar rules are used to create 26 rule functions. The input English sentence is first preprocessed to obtain initial tags, and the rule functions are then called on each word to produce the final tagged sentence. In experiments on the Brown corpus, the method reaches a tagging accuracy of 93.95%. The results indicate that the rule-based method is feasible and effective, and that it improves the accuracy and simplicity of English part-of-speech tagging.
Keywords: part-of-speech tagging  rule-based  unsupervised learning  rule function
Received: 2014-09-24
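
The two-stage pipeline described in the abstract (preprocessing that assigns initial tags, followed by rule functions called on each word) can be sketched roughly as below. This is only an illustration of the general shape: the lexicon, the four rule functions, the coarse tag names, and the token-level `accuracy` helper are all hypothetical and are not the paper's 26 rule functions, its grammar rules, or the Brown tagset.

```python
from typing import Callable, List, Optional, Tuple

Tagged = List[Tuple[str, str]]  # (word, tag) pairs

# Hypothetical closed-class lexicon used only to seed the initial tags.
CLOSED_CLASS = {
    "the": "DET", "a": "DET", "an": "DET",
    "to": "TO", "of": "PREP", "in": "PREP", "on": "PREP",
    "is": "VERB", "are": "VERB", "was": "VERB",
    "he": "PRON", "she": "PRON", "it": "PRON", "they": "PRON",
}

def initial_tags(sentence: str) -> Tagged:
    """Preprocess: drop a trailing period, split on whitespace,
    tag closed-class words, and mark everything else as UNK."""
    words = sentence.strip().rstrip(".").split()
    return [(w, CLOSED_CLASS.get(w.lower(), "UNK")) for w in words]

# A rule function looks at position i and returns a tag, or None if it does not apply.
RuleFn = Callable[[Tagged, int], Optional[str]]

def rule_after_to(tagged: Tagged, i: int) -> Optional[str]:
    # Assumed rule: a word directly after "to" is a verb (infinitive).
    return "VERB" if i > 0 and tagged[i - 1][1] == "TO" else None

def rule_suffix(tagged: Tagged, i: int) -> Optional[str]:
    # Assumed rule: guess the tag from common English suffixes.
    word = tagged[i][0].lower()
    if word.endswith("ly"):
        return "ADV"
    if word.endswith("ing") or word.endswith("ed"):
        return "VERB"
    return None

def rule_after_det(tagged: Tagged, i: int) -> Optional[str]:
    # Assumed rule: a word directly after a determiner is a noun.
    return "NOUN" if i > 0 and tagged[i - 1][1] == "DET" else None

def rule_default(tagged: Tagged, i: int) -> Optional[str]:
    # Fallback: treat any remaining unknown word as a noun.
    return "NOUN"

RULES: List[RuleFn] = [rule_after_to, rule_suffix, rule_after_det, rule_default]

def tag_sentence(sentence: str) -> Tagged:
    """Assign initial tags, then resolve each UNK word with the first rule that fires."""
    tagged = initial_tags(sentence)
    for i, (word, tag) in enumerate(tagged):
        if tag != "UNK":
            continue
        for rule in RULES:
            new_tag = rule(tagged, i)
            if new_tag is not None:
                tagged[i] = (word, new_tag)
                break
    return tagged

def accuracy(predicted: Tagged, gold: Tagged) -> float:
    """Token-level accuracy: fraction of positions whose tags match (an assumed metric)."""
    correct = sum(p[1] == g[1] for p, g in zip(predicted, gold))
    return correct / len(gold)

if __name__ == "__main__":
    for word, tag in tag_sentence("The dog quickly jumped to catch the ball."):
        print(f"{word}/{tag}")
```

Words are processed left to right, so a rule can rely on tags already assigned to earlier words. No training data is involved, which is the sense in which such a tagger is unsupervised; a gold-tagged corpus would be needed only for evaluation.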
This article is indexed in CNKI, Wanfang Data, and other databases.