Text segmentation of patent summary based on a classification algorithm
Cite this article: DING Chang-lin, CAI Dong-feng, WANG Pei-yan. Text segmentation of patent summary based on a classification algorithm[J]. Journal of Shandong University (Natural Science), 2012, 47(5): 68-72, 77.
Authors: DING Chang-lin (丁长林), CAI Dong-feng (蔡东风), WANG Pei-yan (王裴岩)
Institution: Knowledge Engineering Research Center, Shenyang Aerospace University, Shenyang 110136, Liaoning, China
Abstract: A patent summary is a condensed description of a patent. Once a summary has been segmented by content, the corresponding patent can be located more accurately. Because patent summaries are short and carry no explicit markers between different kinds of content, traditional text segmentation methods cannot be applied to them. This paper recasts the segmentation of a patent summary as a sentence classification problem and applies classification algorithms to solve it. By analyzing how different classification algorithms and different features perform on this problem, the paper verifies that segmenting patent summaries through sentence classification is feasible.

Keywords: patent summary; text segmentation; sentence unit; classification algorithm; part of speech
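
The idea in the abstract, treating segmentation as sentence classification, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the label set (`topic`, `structure`, `effect`), the toy training sentences, the choice of a Naive Bayes classifier, and the use of plain word counts as features (the paper also studies part-of-speech features, which are omitted). Each sentence is classified on its own, and adjacent sentences that receive the same label are merged into one segment.

```python
import math
import re
from collections import Counter, defaultdict
from itertools import groupby


def tokenize(sentence):
    """Lowercase word tokens; a simple stand-in for richer features."""
    return re.findall(r"[a-z]+", sentence.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative choice)."""

    def fit(self, sentences, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for sentence, label in zip(sentences, labels):
            self.word_counts[label].update(tokenize(sentence))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, sentence):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of every token.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(sentence):
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def segment(sentences, classifier):
    """Classify each sentence, then merge adjacent sentences sharing a label."""
    labeled = [(classifier.predict(s), s) for s in sentences]
    return [(label, [s for _, s in group])
            for label, group in groupby(labeled, key=lambda pair: pair[0])]


# Hypothetical training data; labels and sentences are invented for this sketch.
TRAIN = [
    ("The invention relates to a filter device for water treatment.", "topic"),
    ("The utility model discloses a bicycle brake.", "topic"),
    ("The device comprises a housing, a pump and a valve.", "structure"),
    ("The brake comprises a lever connected to a cable.", "structure"),
    ("The invention has the advantages of low cost and simple structure.", "effect"),
    ("The method improves efficiency and reduces cost.", "effect"),
]

clf = NaiveBayes().fit([s for s, _ in TRAIN], [l for _, l in TRAIN])

# Hypothetical patent summary, already split into sentences.
abstract = [
    "The invention relates to a lamp holder.",
    "The holder comprises a base and a clamp connected to the base.",
    "The clamp comprises a spring.",
    "The invention has the advantages of low cost.",
]

segments = segment(abstract, clf)
for label, sentences in segments:
    print(label, "|", " ".join(sentences))
```

On this toy input the two middle sentences fall into a single `structure` segment, between a `topic` segment and an `effect` segment. A real system would need labeled patent summaries and would likely compare several classifiers and feature sets, as the abstract describes.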
This article is indexed in CNKI, Wanfang Data, and other databases.