
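Application of rank aggregation algorithms in a campus network search engine
Cite this article: LI Yue, AN Jie, LI Xing. Application of rank aggregation to campus network search engine[J]. Journal of Dalian University of Technology, 2005, 45(Z1): 257-260.
Authors: LI Yue  AN Jie  LI Xing
Affiliations: 1. Department of Electronic Engineering, Tsinghua University, Beijing 100084
2. Network Center, Tsinghua University, Beijing 100084
Abstract: Web page ranking is one of the core technologies of a search engine. A campus network search engine is a search engine whose search content is the Web pages within a single campus network. Because a campus network differs from both the Internet and intranets, this study focuses on how various heuristics affect the optimization of page ranking on a campus network and on the role of rank aggregation in a campus network search engine. Experimental results show that the effect of each heuristic depends on the experimental data set, and that the recall obtained after rank aggregation differs greatly across combinations of heuristics (2% to 48%). Every combination with recall above 35% contains at least four heuristics; that is, ranking in a campus network search engine needs to take into account, according to the data set, the ranked results of multiple heuristics. Rank aggregation is one of the techniques necessary for a campus network search engine to achieve good recall. A page ranking module based on rank aggregation has been deployed in the Tsinghua University campus network search engine.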

Keywords: search engine  Markov chain  rank aggregation  heuristic evidence  recall
Article ID: 0253-9721(2005)06-0057-03
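
The abstract compares recall across combinations of heuristics after fusion. The following minimal sketch shows how such a comparison could be set up. It rests on several assumptions that are not taken from the paper: recall is measured over the top `k` fused results, fusion uses a simple Borda count as a stand-in for the paper's method, and the heuristic names and page identifiers in the usage lines are hypothetical.

```python
from itertools import combinations


def borda_fuse(rankings):
    """Fuse ranked lists by Borda count (a stand-in, not the paper's fusion step)."""
    universe = {doc for ranking in rankings for doc in ranking}
    scores = {doc: 0 for doc in universe}
    for ranking in rankings:
        for pos, doc in enumerate(ranking):
            # Higher positions earn more points; a list gives nothing to docs it omits.
            scores[doc] += len(universe) - pos
    # Sort by descending score, breaking ties by document id for determinism.
    return sorted(universe, key=lambda doc: (-scores[doc], doc))


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)


def recall_by_combination(heuristic_rankings, relevant, k):
    """Map every non-empty combination of heuristics to the recall of its fused list."""
    names = sorted(heuristic_rankings)
    results = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            fused = borda_fuse([heuristic_rankings[name] for name in combo])
            results[combo] = recall_at_k(fused, relevant, k)
    return results


# Hypothetical heuristics and pages, for illustration only.
example = {"title": ["p1", "p2", "p3"], "anchor": ["p3", "p1", "p2"]}
table = recall_by_combination(example, relevant={"p1", "p3"}, k=2)
```

The number of combinations grows as 2^n - 1 in the number of heuristics, so exhaustive enumeration of this kind is only practical for a small set of heuristics.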

Application of rank aggregation to campus network search engine
LI Yue, AN Jie, LI Xing. Application of rank aggregation to campus network search engine[J]. Journal of Dalian University of Technology, 2005, 45(Z1): 257-260.
Authors: LI Yue  AN Jie  LI Xing
Abstract: Relevance ranking is one of the key technologies of a web search engine. A campus network search engine (CNSE) focuses on web information within a single campus network, which has its own characteristics compared with the Internet and intranets. The influence of heuristic evidence on web page ranking and the performance of rank aggregation in a CNSE were analyzed. The impact of each heuristic differs across data sets, and the recall of the combinations of heuristics varies from 2% to 48%. Every combination whose recall exceeds 35% includes at least four heuristics; that is, a ranking system should consider several heuristics, chosen according to the data set. The experimental results show that rank aggregation is necessary for producing robust results in a CNSE. The rank aggregation algorithm has been deployed in the Tsinghua University campus network search engine.
Keywords: search engine  Markov chain  rank aggregation  heuristic evidence  recall
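
The keywords pair rank aggregation with Markov chains, but the abstract does not say which chain the authors use. As an illustration only, the sketch below follows the general shape of the MC4 chain from the rank aggregation literature: from the current page, pick another page uniformly at random and move to it if a majority of the lists that rank both pages prefer it. The uniform teleportation step, the `damping` value, the iteration count, and the page names are assumptions added so that the example converges and stays self-contained.

```python
from itertools import chain


def mc4_aggregate(rankings, damping=0.95, iterations=200):
    """Fuse ranked lists with an MC4-style Markov chain (illustrative sketch)."""
    items = sorted(set(chain.from_iterable(rankings)))
    n = len(items)
    index = {item: i for i, item in enumerate(items)}
    # positions[r][item] is the rank of item in list r (0 is best).
    positions = [{item: pos for pos, item in enumerate(r)} for r in rankings]

    def majority_prefers(q, p):
        """True if most lists that rank both q and p place q above p."""
        wins = losses = 0
        for pos in positions:
            if q in pos and p in pos:
                if pos[q] < pos[p]:
                    wins += 1
                else:
                    losses += 1
        return wins > losses

    # Row-stochastic transition matrix: move from p to q with probability 1/n
    # when the majority prefers q; otherwise the probability mass stays on p.
    transition = [[0.0] * n for _ in range(n)]
    for p in items:
        i = index[p]
        for q in items:
            if q != p and majority_prefers(q, p):
                transition[i][index[q]] = 1.0 / n
        transition[i][i] = 1.0 - sum(transition[i])

    # Power iteration with uniform teleportation (an assumption for ergodicity).
    dist = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [(1.0 - damping) / n] * n
        for i in range(n):
            for j in range(n):
                nxt[j] += damping * dist[i] * transition[i][j]
        dist = nxt

    # Pages with more stationary probability rank higher.
    return sorted(items, key=lambda item: (-dist[index[item]], item))


# Three hypothetical per-heuristic rankings of the same four pages.
fused = mc4_aggregate([
    ["a", "b", "c", "d"],
    ["b", "a", "c", "d"],
    ["a", "c", "b", "d"],
])
# fused == ['a', 'b', 'c', 'd']
```

In this example page `a` beats every other page in a pairwise majority vote, so the chain concentrates most of its probability mass on `a`, and the remaining pages follow in order of how many pairwise contests they win.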
This article is indexed in CNKI, Wanfang Data, and other databases.