Automatic Identification of Phrase Field in Corpus of Contemporary Chinese Complex Sentences (现代汉语复句中短语字段的自动识别初探)

Cite this article:	LI Qiong, HU Jin-zhu, YU Xiao-juan. 现代汉语复句中短语字段的自动识别初探[J]. 洛阳大学学报 (Journal of Luoyang University), 2007, 22(3): 62-66.

Authors:	李琼 (LI Qiong), 胡金柱 (HU Jin-zhu), 俞小娟 (YU Xiao-juan)

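Affiliations:	1. Institute of Linguistics, Huazhong Normal University, Wuhan, Hubei 430068; 2. Department of Computer Science, Huazhong Normal University, Wuhan, Hubei 430068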

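Abstract (translated from the Chinese):	To build a deeply processed corpus of modern Chinese complex sentences for Chinese information processing, phrase fields must be identified automatically, so that they can be excluded from clause-level analysis. The work builds on automatic word segmentation and part-of-speech tagging. First, a program provisionally labels every field that contains no verb as a phrase field. Fields that contain a verb but have clear formal markers before or after them are identified by rules written for that purpose. A remaining group of fields contain only one verb and have no clear formal markers before or after them; for these, the structural particle "的" (de) inside the field is used to aid identification.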
Keywords:	phrase field; formal marker; "的" (de); semantics; rules; statistics
Article ID:	1007-113X(2007)03-0062-05
Revised:	2007-06-20
Automatic Identification of Phrase Field in Corpus of Contemporary Chinese Complex Sentences

LI Qiong, HU Jin-zhu, YU Xiao-juan. Automatic Identification of Phrase Field in Corpus of Contemporary Chinese Complex Sentences[J]. Journal of Luoyang University, 2007, 22(3): 62-66.

Authors:	LI Qiong, HU Jin-zhu, YU Xiao-juan

Institution:	1. Language Study Center, Huazhong Normal University, Wuhan 430079, China; 2. Department of Computer Science, Huazhong Normal University, Wuhan 430079, China

Abstract:	To build a deeply processed corpus of contemporary Chinese complex sentences for Chinese information processing, we need to identify phrase fields automatically, with the aim of excluding them from clause-level analysis. The work builds on automatic word segmentation and part-of-speech tagging. The first step is to pick out the fields that contain no verb and provisionally label them as phrase fields. Fields that contain verbs and have obvious markers at the beginning or end are identified by rules established for that purpose. Fields that contain only one verb but have no obvious markers are identified with the help of the structural auxiliary "de". Some syntactic and semantic knowledge is applied in this process.

Keywords:	phrase field; formal sign; "de"; semantics; rule; statistics
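
The three-stage procedure outlined in the abstract can be sketched roughly as follows. This is only a minimal illustration, not the authors' program: the abstract does not list the actual formal markers or the "de" rules, so the marker sets `LEADING_MARKERS` and `TRAILING_MARKERS`, the assumption that verb tags start with `v` and noun tags with `n` (as in PKU-style tagsets), and the specific "de" test (a single verb followed by "的" with a noun closing the field) are all hypothetical stand-ins.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, POS tag) from a segmenter/tagger

# Hypothetical marker lists, for illustration only; the abstract
# does not say which formal markers the authors' rules rely on.
LEADING_MARKERS = {"在", "对于", "关于", "当"}
TRAILING_MARKERS = {"时", "后", "之后", "以前"}


def is_verb(tag: str) -> bool:
    # Assumption: verb tags start with "v" (PKU-style tagset).
    return tag.startswith("v")


def classify_field(field: List[Token]) -> str:
    """Return "phrase" or "clause" for one comma-delimited field."""
    verb_positions = [i for i, (_, tag) in enumerate(field) if is_verb(tag)]

    # Stage 1: a field with no verb is provisionally a phrase field.
    if not verb_positions:
        return "phrase"

    # Stage 2: a verb is present, but a formal marker opens or closes
    # the field (assumed marker lists above).
    if field[0][0] in LEADING_MARKERS or field[-1][0] in TRAILING_MARKERS:
        return "phrase"

    # Stage 3: exactly one verb and no marker; use the particle "的".
    # Assumed rule: the verb precedes "的" and the field ends in a noun,
    # i.e. the verb sits inside a modifier of a nominal head.
    if len(verb_positions) == 1:
        verb_at = verb_positions[0]
        de_after_verb = any(
            word == "的" for word, _ in field[verb_at + 1:]
        )
        if de_after_verb and field[-1][1].startswith("n"):
            return "phrase"

    # Anything else is left for clause-level analysis.
    return "clause"
```

For example, under these assumptions `classify_field([("他", "r"), ("买", "v"), ("的", "u"), ("书", "n")])` is treated as a phrase field, while the same field with "了" in place of "的" is passed on as a clause.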
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
