
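A Preliminary Study on the Automatic Identification of Phrase Fields in Contemporary Chinese Complex Sentences
Cite this article: LI Qiong, HU Jin-zhu, YU Xiao-juan. A Preliminary Study on the Automatic Identification of Phrase Fields in Contemporary Chinese Complex Sentences [J]. Journal of Luoyang University, 2007, 22(3): 62-66.
Authors: LI Qiong  HU Jin-zhu  YU Xiao-juan
Affiliations: 1. Institute of Linguistics, Huazhong Normal University, Wuhan, Hubei 430068
2. Department of Computer Science, Huazhong Normal University, Wuhan, Hubei 430068
Abstract: To build a deeply processed corpus of contemporary Chinese complex sentences for Chinese information processing, phrase fields must be identified automatically, so that they can be excluded from the scope of clause-level analysis. The work builds on automatic word segmentation and part-of-speech tagging. First, a program provisionally identifies every field that contains no verb as a phrase field. Fields that contain a verb but are bounded by obvious formal markers are identified by corresponding rules. A further group of fields contain only one verb yet have no obvious formal markers before or after them; for these, the structural particle "的" (de) inside the field is used to aid identification.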

Keywords: phrase field  formal marker  semantics of "的" (de)  rule  statistics
Article ID: 1007-113X(2007)03-0062-05
Revised manuscript received: 2007-06-20

Automatic Identification of Phrase Field in Corpus of Contemporary Chinese Complex Sentences
LI Qiong, HU Jin-zhu, YU Xiao-juan. Automatic Identification of Phrase Field in Corpus of Contemporary Chinese Complex Sentences [J]. Journal of Luoyang University, 2007, 22(3): 62-66.
Authors:LI Qiong  HU Jin-zhu  YU Xiao-juan
Institution:1. Language Study Center, Huazhong Normal University, Wuhan 430079, China;2. Department of Computer Science, Huazhong Normal University, Wuhan 430079, China
Abstract: To build a corpus of contemporary Chinese complex sentences for information processing, we need to identify phrase fields automatically, with the aim of excluding them from clause-level analysis. The research builds on automatic word segmentation and part-of-speech tagging. The first step is to pick out the fields that do not contain a verb and provisionally label them as phrase fields. Fields that contain verbs and have obvious markers at the beginning or end are marked out by establishing rules. Fields that contain only one verb but have no obvious markers are identified with the help of the structural auxiliary "de". Some syntactic and semantic knowledge is applied in this process.
Keywords:phrase field  formal sign  "de"  semantic  rule  statistic
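
The three-stage procedure outlined in the abstract can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the authors' program: the abstract does not list the actual formal markers or the rules for "的", so the tag name `v`, the marker sets `START_MARKERS` and `END_MARKERS`, the function `classify_field`, and the "的" heuristic are all hypothetical. Input is assumed to be a field that has already been segmented and part-of-speech tagged.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)

# Assumption: verbs carry the tag "v" (as in common Chinese POS tagsets).
VERB_TAG = "v"

# Hypothetical formal markers, chosen for illustration only;
# the abstract does not enumerate the markers used in the paper.
START_MARKERS = {"当", "在", "对于"}
END_MARKERS = {"时", "后", "以后", "之前"}


def classify_field(field: List[Token]) -> str:
    """Classify one segmented, POS-tagged field in three stages."""
    if not field:
        raise ValueError("field must contain at least one token")

    verb_positions = [i for i, (_, pos) in enumerate(field) if pos == VERB_TAG]

    # Stage 1: a field without any verb is provisionally a phrase field.
    if not verb_positions:
        return "phrase:no-verb"

    # Stage 2: a field with a verb but an obvious formal marker at its
    # start or end is caught by a rule (here: simple set membership).
    if field[0][0] in START_MARKERS or field[-1][0] in END_MARKERS:
        return "phrase:marker-rule"

    # Stage 3: exactly one verb and no marker; use the particle "的".
    # Crude stand-in heuristic (an assumption): the verb precedes a "的"
    # that is itself followed by at least one more token, i.e. the verb
    # sits inside the modifier of a "的" construction.
    if len(verb_positions) == 1:
        last = len(field) - 1
        for i, (word, _) in enumerate(field):
            if word == "的" and verb_positions[0] < i < last:
                return "phrase:de-heuristic"

    # Everything else is left for clause-level analysis.
    return "clause-candidate"


if __name__ == "__main__":
    print(classify_field([("他", "r"), ("买", "v"), ("的", "u"), ("书", "n")]))
```

The "的" test here is deliberately simplistic and would misclassify many real fields; according to the abstract, the original work draws on additional syntactic and semantic knowledge at this stage.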
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.