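A Personalized Algorithm for Topic Distillation and Hierarchy Discovery
Cite this article: Fu Xianghua, Ma Zhaofeng, He Ming, Feng Boqin. A Personalized Algorithm for Topic Distillation and Hierarchy Discovery [J]. Journal of Xi'an Jiaotong University, 2005, 39(2): 119-122.
Authors: Fu Xianghua  Ma Zhaofeng  He Ming  Feng Boqin
Affiliation: School of Electronic and Information Engineering, Xi'an Jiaotong University, 710049, Xi'an
Funding: National High Technology Research and Development Program of China (2003AA1Z2610).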
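Abstract: The hypertext induced topic search (HITS) algorithm is analyzed from the perspective of semantic relevance, and its topic drift is traced to pages being projected onto a wrong semantic basis. A personalized topic distillation and hierarchical exploration algorithm (PTDHE) is therefore proposed. It expands the query terms using the user's personal query log and constructs a personalized root set and base set that match the user's needs, thereby preventing topic drift. PTDHE uses a graph partitioning method based on the min-max principle to hierarchically discover sets of topic pages related to the user's query, applies HITS to compute the authority value of the pages in each topic page set separately, and returns the authority pages of the other topics related to the query. Experimental results on 14 queries show that, compared with HITS, PTDHE not only reduces the topic drift rate by 2% to 66% but also discovers multiple topics related to the query.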

Keywords: link analysis  hypertext induced topic search  topic distillation  topic drift  query expansion
Article ID: 0253-987X(2005)02-0119-04
Revised: April 19, 2004

New Algorithm for Personalized Topic Distillation and Hierarchical Exploration
Fu Xianghua, Ma Zhaofeng, He Ming, Feng Boqin. New Algorithm for Personalized Topic Distillation and Hierarchical Exploration[J]. Journal of Xi'an Jiaotong University, 2005, 39(2): 119-122.
Authors: Fu Xianghua  Ma Zhaofeng  He Ming  Feng Boqin
Abstract: The hypertext induced topic search (HITS) procedure is interpreted with a semantic relation model, which shows that HITS suffers topic drift because Web pages are projected onto a wrong latent semantic basis. A new algorithm for personalized topic distillation and hierarchical exploration (PTDHE) is presented to improve the quality of topic distillation. PTDHE constructs a personalized root set and base set through query expansion based on individual query logs to avoid topic drift, applies a hierarchical division algorithm based on the min-max principle to explore topics related to the user query, and then uses HITS to evaluate the authority pages of those related topics and return them to end users. Experimental results on 14 queries show that PTDHE outperforms HITS in both topic distillation quality and topic exploration ability: it reduces the topic drift rate by 2% to 66% relative to HITS and discovers several related topics for queries that have multiple meanings.
Keywords:link analysis  hypertext induced topic search  topic distillation  topic drift  query expansion
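
The abstract's final stage, running HITS separately on each discovered topic page set, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `hits` and `authorities_per_topic`, the iteration count, and the toy link graph are all assumptions made for illustration, and the topic page sets are passed in directly rather than found by the min-max graph partitioning the paper describes.

```python
from math import sqrt


def _normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0.0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Return (authority, hub) score dicts for a directed link graph.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    auth = {p: 1.0 for p in pages}
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: sum the hub scores of pages that link in.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        # Hub update: sum the new authority scores of pages linked to.
        new_hub = {
            p: sum(new_auth[dst] for dst in links.get(p, ())) for p in pages
        }
        auth = _normalize(new_auth)
        hub = _normalize(new_hub)
    return auth, hub


def authorities_per_topic(links, topic_sets, top_k=3):
    """Run HITS on the subgraph induced by each topic page set.

    Assumption: `topic_sets` is supplied by an earlier partitioning step.
    Returns one ranked list of top authority pages per topic.
    """
    results = []
    for topic in topic_sets:
        # Keep only links whose both endpoints lie inside this topic.
        sub = {p: [q for q in links.get(p, ()) if q in topic] for p in topic}
        auth, _ = hits(sub)
        ranked = sorted(auth, key=lambda p: (-auth[p], p))
        results.append(ranked[:top_k])
    return results


if __name__ == "__main__":
    # Hypothetical toy graph with two unrelated topics.
    toy_links = {"a": ["c"], "b": ["c"], "c": [], "x": ["z"], "y": ["z"], "z": []}
    toy_topics = [{"a", "b", "c"}, {"x", "y", "z"}]
    print(authorities_per_topic(toy_links, toy_topics, top_k=1))
```

Scoring each topic on its own induced subgraph keeps a densely linked but off-topic cluster from dominating the authority ranking, which is the intuition behind computing authority values per topic page set rather than over the whole base set.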
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.