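A method for calculating sentence similarity in information retrieval
Cite this article: LIU Yunfang, YANG Yan, JIA Zhen, YIN Hongfeng, YANG Yufei. A method for calculating sentence similarity in information retrieval [J]. Applied Science and Technology (应用科技), 2014(4): 41-46.
Authors: LIU Yunfang  YANG Yan  JIA Zhen  YIN Hongfeng  YANG Yufei
Affiliation: School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, Sichuan, China
Funding: National Natural Science Foundation of China (61170111, 61152001); Open Project of the Key Laboratory of Complex Systems Management and Control, Institute of Automation, Chinese Academy of Sciences (20110102); Fundamental Research Funds for the Central Universities (SWJTUllZT08)
Abstract: To improve the precision of retrieval results in information retrieval, a sentence similarity calculation method based on syntactic analysis and weighted path length is proposed. The method first applies word segmentation, part-of-speech tagging and syntactic analysis to the user question, and uses the results to extract keywords from the question, weight them, and expand them with synonyms and near-synonyms. A calculation based on weighted path length is then proposed and used to compute the similarity between the user question and the title sentence of each retrieved item, defined as the relative ratio of the weighted path length of the question to that of the title sentence. The retrieval results are re-ranked by this similarity to improve their precision. Experiments show that this sentence similarity method effectively improves the precision of retrieval results in information retrieval.

Keywords: information retrieval  similarity  part-of-speech tagging  syntactic analysis  weighted path length  secondary ranking  precision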

A calculation method of the sentence similarity in information retrieval
LIU Yunfang, YANG Yan, JIA Zhen, YIN Hongfeng, YANG Yufei. A calculation method of the sentence similarity in information retrieval [J]. Applied Science and Technology, 2014(4): 41-46.
Authors:LIU Yunfang  YANG Yan  JIA Zhen  YIN Hongfeng  YANG Yufei
Institution: School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, China
Abstract: To improve the precision of retrieval results in information retrieval, a sentence similarity calculation method based on syntactic analysis and weighted path length is proposed. First, word segmentation, part-of-speech tagging and syntactic analysis are applied to the user question. Based on the results of this processing, keywords are extracted from the question, weighted, and expanded with synonyms and near-synonyms. A calculation based on weighted path length is then proposed and used to compute the similarity between the user question and the title sentence of each retrieved item. This similarity can be regarded as the relative ratio between the weighted path length of the title sentence and the weighted path length of the question. The retrieval results are then re-ranked according to these similarities, improving the recall and precision of the retrieval results. Experiments show that this sentence similarity calculation method can improve the precision of retrieved results in information retrieval.
Keywords: information retrieval  similarity  part-of-speech tagging  syntactic analysis  weighted path length  keyword  precision ratio
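
The abstract describes the pipeline only at a high level, so the following Python sketch is illustrative rather than the authors' implementation. It assumes the question has already been segmented, tagged and parsed, and that each extracted keyword carries a weight, a path length (here taken to be its depth in the parse tree) and a synonym set. The `Keyword` class, the weight and depth values, and the rule that a title's weighted path length counts only the question keywords it matches are all assumptions made for this example; the paper's exact formula is not given on this page.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Keyword:
    """A question keyword (hypothetical structure, not from the paper)."""
    term: str
    weight: float   # assumed importance weight, e.g. derived from part of speech
    depth: int      # assumed path length from the parse-tree root
    synonyms: frozenset = field(default_factory=frozenset)

    def matches(self, tokens):
        # A keyword matches if the term or any of its synonyms occurs in the title.
        return self.term in tokens or not self.synonyms.isdisjoint(tokens)


def weighted_path_length(keywords):
    # Sum of weight * path length over the given keywords.
    return sum(k.weight * k.depth for k in keywords)


def similarity(question_keywords, title_tokens):
    # Ratio of the title's weighted path length (matched keywords only)
    # to the question's weighted path length; lies in [0, 1].
    total = weighted_path_length(question_keywords)
    if total == 0:
        return 0.0
    tokens = set(title_tokens)
    matched = [k for k in question_keywords if k.matches(tokens)]
    return weighted_path_length(matched) / total


def rerank(question_keywords, titles):
    # Secondary sort of retrieved titles by descending similarity.
    return sorted(titles, key=lambda t: similarity(question_keywords, t),
                  reverse=True)


# Toy data: tokens, weights and depths are made up for illustration.
question = [
    Keyword("price", 2.0, 1, frozenset({"cost"})),
    Keyword("laptop", 3.0, 2, frozenset({"notebook"})),
]
titles = [
    ["phone", "price"],
    ["notebook", "cost", "guide"],
    ["laptop", "review"],
]
ranked = rerank(question, titles)
```

With these toy inputs, the title that matches both keywords through their synonyms ranks first, followed by the title matching only the more heavily weighted keyword. In a real system the tokens, weights and depths would come from a segmenter and parser rather than being written by hand.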
This article is indexed in databases including VIP (维普).