
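A patent document retrieval method based on automatic query expansion
Cite this article: Yang Shuai, Wang Feng, Lin Lanfen, Zhu Xiaowei, Xie Fei. A patent document retrieval method based on automatic query expansion[J]. Sciencepaper Online (中国科技论文在线), 2013, 0(10): 1057-1063
Authors: Yang Shuai  Wang Feng  Lin Lanfen  Zhu Xiaowei  Xie Fei
Affiliation: College of Computer Science and Technology, Zhejiang University, Hangzhou 310027
Funding: Specialized Research Fund for the Doctoral Program of Higher Education (20110101110065); Zhejiang Provincial Innovation Team Program (2009R50015)
Abstract: To address the insufficient understanding of user intent and the inadequate query expansion in existing patent retrieval, this paper proposes a patent document retrieval method based on automatic query expansion. First, taking the characteristics of patent documents into account, a patent domain vocabulary is built with a domain-term extraction method based on an improved TF-IDF formula. At retrieval time, the query string is analyzed to obtain the query keywords, which are matched against the domain vocabulary to determine the domain of the query and the difficulty of expanding it. Automatic query expansion based on pseudo-relevance feedback then analyzes differences in term distribution across the pseudo-relevant documents to generate and rank expansion terms. Finally, the expansion terms are combined with the original query to form a new query, which completes the patent search. Experimental results show that the method achieves high recall and average precision.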

Keywords: artificial intelligence  patent retrieval  domain vocabulary  query expansion  pseudo-relevance feedback
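
The abstract does not give the improved TF-IDF formula, so the following is only a minimal sketch of the general idea of domain vocabulary extraction, using the standard TF-IDF weighting as a stand-in. It pools the documents of each domain, weights each term by its relative frequency in that domain times the log inverse of the number of domains containing it, and keeps the top-ranked terms. The function name `build_domain_vocab`, the parameter `top_k`, and the input layout (a dict from domain label to tokenized documents) are all assumptions made for illustration, not part of the paper.

```python
import math
from collections import Counter


def build_domain_vocab(domain_docs, top_k=3):
    """Rank terms per domain with plain TF-IDF (not the paper's improved formula).

    domain_docs: dict mapping a domain label to a list of tokenized documents.
    Returns a dict mapping each domain label to its top_k terms.
    """
    # Term counts per domain, with all documents of a domain pooled together.
    domain_tf = {
        domain: Counter(term for doc in docs for term in doc)
        for domain, docs in domain_docs.items()
    }
    n_domains = len(domain_tf)
    # Number of domains in which each term occurs at least once.
    domain_freq = Counter(term for tf in domain_tf.values() for term in tf)

    vocab = {}
    for domain, tf in domain_tf.items():
        total = sum(tf.values())
        scores = {
            term: (count / total) * math.log(n_domains / domain_freq[term])
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        vocab[domain] = ranked[:top_k]
    return vocab
```

Terms that occur in every domain receive a weight of zero under this scheme, so generic patent wording such as "device" drops to the bottom of each domain's ranking.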

A patent retrieval method based on automatic query expansion
Yang Shuai,Wang Feng,Lin Lanfen,Zhu Xiaowei,Xie Fei. A patent retrieval method based on automatic query expansion[J]. Sciencepaper Online, 2013, 0(10): 1057-1063
Authors:Yang Shuai  Wang Feng  Lin Lanfen  Zhu Xiaowei  Xie Fei
Affiliation:(College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China)
Abstract: Existing patent retrieval methods do not capture the user's query intent effectively because their query expansion is inadequate. To address this, we propose a patent retrieval method based on automatic query expansion. Taking the characteristics of patent documents into account, an improved TF-IDF scheme is first used to extract patent domain terms and build the domain vocabularies. At the retrieval stage, the query input is analyzed to extract keywords, and the domain of the query and the difficulty of expanding it are then determined from the domain vocabularies. Next, pseudo-relevance feedback (PRF)-based automatic query expansion analyzes the variation in term distribution across the pseudo-relevant documents to generate and rank candidate expansion terms. Finally, the expansion terms are combined with the original query conditions to form the final query used for searching. Comparative experiments show that the method achieves better recall and average precision.
Keywords:artificial intelligence  patent retrieval  domain vocabulary  query expansion  PRF
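
The expansion step described in the abstract can be illustrated with a small sketch. The abstract does not specify the term-scoring function, so this example substitutes a common choice from the PRF literature: each candidate term is scored by its contribution to the Kullback-Leibler divergence between the term distribution of the pseudo-relevant documents and that of the whole collection. The first-pass ranking by raw query-term counts, the function name `expand_query`, and the parameters `k` (number of feedback documents) and `m` (number of expansion terms) are assumptions for illustration only.

```python
import math
from collections import Counter


def expand_query(query, docs, k=2, m=2):
    """Return the query terms followed by m expansion terms.

    query: list of query terms; docs: list of tokenized documents.
    The top-k documents of a first-pass ranking are treated as pseudo-relevant.
    """
    # First pass: rank documents by the number of query-term occurrences.
    def overlap(doc):
        return sum(doc.count(term) for term in query)

    ranked = sorted(range(len(docs)), key=lambda i: (-overlap(docs[i]), i))
    feedback = [docs[i] for i in ranked[:k]]

    # Term counts in the pseudo-relevant set and in the whole collection.
    rel = Counter(term for doc in feedback for term in doc)
    coll = Counter(term for doc in docs for term in doc)
    rel_total, coll_total = sum(rel.values()), sum(coll.values())

    # KL-divergence contribution of each candidate term (assumed scoring).
    scores = {}
    for term, count in rel.items():
        if term in query:
            continue  # only new terms are candidates for expansion
        p_rel = count / rel_total
        p_coll = coll[term] / coll_total
        scores[term] = p_rel * math.log(p_rel / p_coll)

    # Highest score first; ties are broken alphabetically for determinism.
    expansion = sorted(scores, key=lambda term: (-scores[term], term))[:m]
    return list(query) + expansion
```

The returned list would then be submitted as the new query. A fuller version would weight the expansion terms below the original terms and adapt `m` to the estimated expansion difficulty, which this sketch leaves out.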
This article is indexed in VIP (维普) and other databases.