Automatic Extraction of Term Chunks Based on Parallel Corpora of English and Chinese (基于英汉平行语料库的术语组块自动抽取)



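Cite this article:	YANG Fuyi. Automatic Extraction of Term Chunks Based on Parallel Corpora of English and Chinese (基于英汉平行语料库的术语组块自动抽取)[J]. 中国科技术语 (Chinese Science and Technology Terms Journal), 2018, 20(2): 12-17.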

Author:	YANG Fuyi (杨福义)

Affiliation:	Anshan Normal University, Anshan, Liaoning 114006

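Abstract (translated from the Chinese):	Building data resources for bilingual parallel corpora is the front end of language engineering. Such corpora contain a large amount of terminology and translation knowledge, so studying and developing them in depth is of great significance for terminology translation. This article describes the deep-processing workflow for parallel corpora and the automated annotation of the Chinese corpus. A "grammar symbol language" is used to build a grammatical image of the text, from which a phrase chunk library is generated. Term translation chunks are then extracted automatically by artificial intelligence methods guided by phrase structure rules, and a term chunk dictionary and word list are generated automatically. The article also gives examples of term chunk queries and of tracing chunks back to their bilingual example sentences.
Keywords:	computational terminology; corpus; knowledge extraction; term component; chunk
Received:	2017-11-01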
Automatic Extraction of Term Chunks Based on Parallel Corpora of English and Chinese

YANG Fuyi. Automatic Extraction of Term Chunks Based on Parallel Corpora of English and Chinese[J]. Chinese Science and Technology Terms Journal, 2018, 20(2): 12-17.

Authors:	YANG Fuyi

Abstract:	Building data resources for bilingual parallel corpora is the front end of language engineering. These corpora contain a large number of terms and a great deal of translation knowledge, so further research on and development of bilingual corpora is of great significance to terminology translation. This article discusses the deep-processing workflow for parallel corpora and the automatic annotation of the Chinese corpus. Using a "grammar symbol language", a grammatical image of the text is built and a phrase chunk library is generated. Following phrase structure rules, term translation chunks are extracted automatically by artificial intelligence methods, and a lexicon and word list of term chunks are generated automatically. The article also lists examples of term chunk queries and of reverse tracing to bilingual example sentences.

Keywords:	computational terminology; corpus; knowledge extraction; term component; chunk
This article is indexed in CNKI and other databases.