
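Automatic Extraction of Term Chunks Based on an English-Chinese Parallel Corpus (基于英汉平行语料库的术语组块自动抽取)
Cite this article: YANG Fuyi (杨福义). 基于英汉平行语料库的术语组块自动抽取 [J]. 中国科技术语, 2018, 20(2): 12-17.
Author: YANG Fuyi (杨福义)
Affiliation: Anshan Normal University, Anshan, Liaoning 114006
Abstract: Building bilingual parallel corpora as data resources is the front end of language engineering. Such corpora contain a large amount of terminology and translation knowledge, so in-depth research on and development of bilingual corpora is of great significance for term translation. The article describes the deep-processing workflow for parallel corpora and the automated annotation of the Chinese corpus. A "grammar symbol language" is used to build a grammatical image of the text and to generate a phrase chunk library. Term translation chunks are then extracted automatically according to phrase structure rules using artificial intelligence methods, and a term chunk dictionary and word list are generated automatically. The article also gives examples of term chunk queries and of tracing chunks back to their bilingual example sentences.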

Keywords: computational terminology; corpus; knowledge extraction; term component; chunk
Received: 2017-11-01

Automatic Extraction of Term Chunks Based on Parallel Corpora of English and Chinese
YANG Fuyi. Automatic Extraction of Term Chunks Based on Parallel Corpora of English and Chinese [J]. Chinese Science and Technology Terms Journal, 2018, 20(2): 12-17.
Authors: YANG Fuyi
Abstract: The construction of bilingual parallel corpora as data resources is the front end of language engineering, and these corpora contain a large amount of terminology and translation knowledge. Making full use of bilingual corpora through further research and development is therefore of great significance to terminology translation. This article discusses the deep-processing workflow for parallel corpora and the automatic annotation of the Chinese corpus. Using a "grammar symbol language", a grammatical image of the text is built and a phrase chunk library is generated. Term translation chunks are extracted automatically according to phrase structure rules using artificial intelligence methods, and a dictionary and word list of term chunks are generated automatically. Examples of term chunk queries and of tracing chunks back to their bilingual example sentences are also given.
Keywords: computational terminology; corpus; knowledge extraction; component; term block
This article is indexed in CNKI and other databases.