
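Context-Sensitive Automatic Chinese Word Segmentation and Lexical Preprocessing Algorithm
Cite this article: HUANG Heyan, LI Yusheng. Context-sensitive automatic Chinese word segmentation and lexical preprocessing algorithm [J]. Journal of Applied Sciences, 1999, 17(2): 148-155.
Authors: HUANG Heyan, LI Yusheng
Affiliation: Research Center of Computer and Language Information Engineering, Chinese Academy of Sciences
Abstract: This paper proposes a context-sensitive automatic Chinese word segmentation and lexical preprocessing algorithm suited to Chinese-English machine translation. The algorithm combines forward multi-path matching with a segmentation ambiguity resolution step based on context-sensitive knowledge. It draws on the large body of syntactic and semantic knowledge in the lexicon of the Chinese-English MT system to disambiguate through context-sensitive rule inference, which brings segmentation accuracy above 99%. The algorithm also performs lexical preprocessing on semantically redundant reduplicated words and on function words that can detach from and reattach to their head words. This reduces the number of entries the system lexicon has to hold and makes sentences easier to analyze and process.

Keywords: automatic Chinese word segmentation; lexical preprocessing; machine translation; context sensitivity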
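
The abstract names forward multi-path matching as the first stage but does not spell it out. The sketch below is only an illustration of the general idea, not the authors' implementation: scanning left to right, it keeps every lexicon word that matches at the current position instead of committing to the longest one, so that all candidate segmentations survive for a later disambiguation step. The function name `forward_multipath_segment`, the `max_len` parameter, and the toy lexicon are assumptions made for this example.

```python
from typing import List, Set


def forward_multipath_segment(text: str, lexicon: Set[str], max_len: int = 4) -> List[List[str]]:
    """Return every segmentation of `text` reachable by left-to-right lexicon matching.

    Illustrative sketch only: the paper's actual matching procedure is not
    described in the abstract.
    """
    if not text:
        return [[]]  # one way to segment the empty string: no tokens

    results: List[List[str]] = []
    matched = False
    # Try longer candidates first, but keep ALL matches (multi-path),
    # rather than stopping at the longest one as plain maximum matching does.
    for size in range(min(max_len, len(text)), 0, -1):
        head = text[:size]
        if head in lexicon:
            matched = True
            for tail in forward_multipath_segment(text[size:], lexicon, max_len):
                results.append([head] + tail)

    if not matched:
        # Unknown character: emit it as a single-character token and continue.
        for tail in forward_multipath_segment(text[1:], lexicon, max_len):
            results.append([text[0]] + tail)
    return results


# Toy lexicon (hypothetical) containing a classic overlapping ambiguity.
lexicon = {"研究", "研究生", "生命", "命", "起源"}
candidates = forward_multipath_segment("研究生命起源", lexicon)
for candidate in candidates:
    print(" / ".join(candidate))
```

With this toy lexicon the function yields two candidates, one starting with 研究生 and one starting with 研究. Choosing between them is the job of the context-sensitive rules the abstract describes, which this sketch does not attempt to reproduce. The recursion is exponential in the worst case and is meant for short illustrative inputs only.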

Context Sensitive Automatic Chinese Word Segmentation and Lexical Preprocessing
HUANG HEYAN, LI YUSHENG. Context Sensitive Automatic Chinese Word Segmentation and Lexical Preprocessing [J]. Journal of Applied Sciences, 1999, 17(2): 148-155.
Authors: HUANG HEYAN, LI YUSHENG
Abstract: This paper proposes a context-sensitive automatic Chinese word segmentation and lexical preprocessing algorithm for a Chinese-English machine translation system. The algorithm combines improved MM matching with rule-based, context-sensitive ambiguity resolution, taking advantage of the large amount of syntactic, semantic, and common-sense knowledge in the lexicon of the MT system. Its accuracy reaches 99%. The algorithm also handles some lexical phenomena, such as reduplicated words and function words, in order to reduce the number of entries in the lexicon and to facilitate the parsing of Chinese sentences.
Keywords: automatic Chinese word segmentation; lexical preprocessing; machine translation
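
To make the reduplication preprocessing mentioned in the abstract more concrete, here is a minimal sketch under stated assumptions. It maps the common AA, AABB, and ABAB reduplication patterns back to a base word that is already in the lexicon, so the reduplicated form itself needs no separate entry. The function name `reduce_reduplication`, the choice of patterns, and the toy lexicon are assumptions for illustration; the abstract does not say which patterns the authors' algorithm covers or how it records the reduplication for later translation.

```python
from typing import Optional, Set


def reduce_reduplication(token: str, lexicon: Set[str]) -> Optional[str]:
    """Return the base word if `token` is an AA, AABB, or ABAB reduplication
    of a lexicon entry; otherwise return None.

    Illustrative sketch only, not the procedure from the paper.
    """
    n = len(token)
    # AA pattern, e.g. 看看 -> 看
    if n == 2 and token[0] == token[1] and token[0] in lexicon:
        return token[0]
    if n == 4:
        a, b, c, d = token
        # AABB pattern, e.g. 高高兴兴 -> 高兴
        if a == b and c == d and a != c and (a + c) in lexicon:
            return a + c
        # ABAB pattern, e.g. 研究研究 -> 研究
        if token[:2] == token[2:] and token[:2] in lexicon:
            return token[:2]
    return None


# Toy lexicon (hypothetical): only base forms are stored.
base_lexicon = {"看", "高兴", "研究"}
for word in ["看看", "高高兴兴", "研究研究", "起源"]:
    print(word, "->", reduce_reduplication(word, base_lexicon))
```

A real system would presumably also keep a marker that the token was reduplicated, since the reduplication can affect how the word is translated; that bookkeeping is left out here.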
This article is indexed in CNKI, VIP, and other databases.