
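Method of Building BFS-CTC: a Chinese Tagged Corpus of Sentential Semantic Structure
Cite this article: LUO Sen-lin, LIU Ying-ying, FENG Yang, HAN Lei, CHEN Gong, WANG Qian. Method of Building BFS-CTC: a Chinese Tagged Corpus of Sentential Semantic Structure[J]. Journal of Beijing Institute of Technology (Natural Science Edition), 2012, 32(3): 311-315.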
Authors: LUO Sen-lin  LIU Ying-ying  FENG Yang  HAN Lei  CHEN Gong  WANG Qian
Affiliation (all authors): School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
Funding: National "242" Program project (2005C48); Beijing Institute of Technology Science and Technology Innovation Program project (2011CX01015)
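Abstract: Drawing on modern Chinese semantics, a hierarchical model of sentential semantic structure is constructed. Based on this model, a Chinese corpus annotated with sentential semantic structure is built (Beijing forest studio-Chinese tagged corpus, BFS-CTC). Using self-developed annotation and management tools, the sentential semantic components in the model and the relations by which they combine are annotated quickly, which reduces training workload and annotation cost. BFS-CTC covers 6 sentence pattern types and about 10,000 sentences. It provides lexical and syntactic annotation that conforms to existing standards, together with sentential semantic structure annotation that follows a custom standard. This supports comparative study of lexical, syntactic, and sentential semantic information, as well as combined use and horizontal analysis of the corpus. In addition, BFS-CTC is highly extensible: further extension corpora and annotation resources can be generated on top of the core annotated corpus.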

Keywords: Chinese information processing  sentential semantic analysis  sentential semantic structure  semantic annotation  corpus
Received: 2011/4/27

Method of Building BFS-CTC: a Chinese Tagged Corpus of Sentential Semantic Structure
LUO Sen-lin, LIU Ying-ying, FENG Yang, HAN Lei, CHEN Gong and WANG Qian. Method of Building BFS-CTC: a Chinese Tagged Corpus of Sentential Semantic Structure[J]. Journal of Beijing Institute of Technology (Natural Science Edition), 2012, 32(3): 311-315.
Authors:LUO Sen-lin  LIU Ying-ying  FENG Yang  HAN Lei  CHEN Gong and WANG Qian
Institution (all authors): School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
Abstract: Based on modern Chinese semantics, a Chinese sentential semantic model is built, and a Chinese tagged corpus, BFS-CTC (Beijing forest studio-Chinese tagged corpus), is then built according to this model. The corpus contains more than ten thousand sentences and covers six Chinese syntactic types. Self-developed tools allow sentences to be tagged quickly and conveniently. BFS-CTC provides lexical, syntactic, and sentential semantic structure tagging information, so it can be used for comparative analysis of syntax and semantics or for horizontal analysis. In addition, the corpus has good scalability and can be used to generate more targeted extension tagged banks.
Keywords:Chinese information processing  sentential semantic analysis  sentential semantic structure  semantic labeling  corpus
This article is indexed in CNKI, Wanfang Data, and other databases.