

Design and Analysis of URL Storage and Retrieval in a Distributed Information Gathering System

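Cite this article:	SONG Hui, ZHENG Zi-ying, ZHANG Ling, MA Fan-yuan. Design and analysis of URL storage and retrieval in a distributed information gathering system [J]. Journal of Shanghai Jiaotong University, 2003, 37(3): 454-457.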

Authors:	SONG Hui, ZHENG Zi-ying, ZHANG Ling, MA Fan-yuan

Affiliation:	Department of Computer Science and Engineering, Shanghai Jiaotong University, Shanghai 200030

Funding:	Key Basic Research Project of the Shanghai Municipal Science and Technology Commission (02DJ14045)

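Abstract:	The efficiency of URL storage and retrieval is key to building a large-scale distributed information gathering system, because it determines how efficiently the system collects Web documents. This paper analyzes URL storage and retrieval performance quantitatively and derives the speed targets that URL storage and URL retrieval must each meet. On that basis, it proposes two URL storage and retrieval prototypes: storage and retrieval on a centralized URL server, and distributed URL storage and retrieval. The two prototypes are compared in terms of retrieval speed, price/performance ratio, scalability, and reliability. In practice, the URL storage and retrieval implementation can be chosen according to the optimization goal.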
Keywords:	distributed system; Web information gathering; URL storage and retrieval
Article ID:	1006-2467(2003)03-0454-04
Revision date:	March 4, 2002
Analysis and Design of URL Indexing in Distributed Information Retrieval System

SONG Hui, ZHENG Zi-ying, ZHANG Ling, MA Fan-yuan. Analysis and Design of URL Indexing in Distributed Information Retrieval System [J]. Journal of Shanghai Jiaotong University, 2003, 37(3): 454-457.

Authors:	SONG Hui, ZHENG Zi-ying, ZHANG Ling, MA Fan-yuan

Abstract:	With the scale of the World Wide Web increasing exponentially, the efficiency of URL storage and indexing is the key to improving the performance of a distributed crawler system. Based on a quantitative analysis of the performance metrics for URL storage and indexing, this paper presents two URL storage and indexing architectures for distributed crawler systems: storage and indexing on a centralized URL server, and distributed URL storage and indexing. The advantages and disadvantages of each are discussed. The distributed URL architecture was implemented in our distributed crawler system, where it works efficiently.

Keywords:	distributed system; Web crawler; URL storage and index
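
The two architectures named in the abstract can be contrasted with a minimal sketch. The abstract does not say how the paper partitions URLs across nodes, so hashing the host name to pick a node is an assumption made here for illustration, and the class and method names (`CentralizedURLStore`, `DistributedURLStore`, `add_if_new`, `node_for`) are hypothetical rather than taken from the paper.

```python
import hashlib
from urllib.parse import urlsplit


class CentralizedURLStore:
    """Sketch of the centralized prototype: one server holds every seen URL."""

    def __init__(self):
        self._seen = set()

    def add_if_new(self, url):
        """Store the URL and return True only if it was not stored before."""
        if url in self._seen:
            return False
        self._seen.add(url)
        return True


class DistributedURLStore:
    """Sketch of the distributed prototype: each node keeps its own URL set.

    Assumption (not from the paper): a URL belongs to the node selected
    by a stable hash of its host name.
    """

    def __init__(self, num_nodes):
        # Each in-memory set stands in for the storage of one crawler node.
        self._nodes = [set() for _ in range(num_nodes)]

    def node_for(self, url):
        """Return the index of the node responsible for this URL."""
        host = urlsplit(url).netloc.lower()
        digest = hashlib.sha256(host.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._nodes)

    def add_if_new(self, url):
        """Check and update only the responsible node's set."""
        seen = self._nodes[self.node_for(url)]
        if url in seen:
            return False
        seen.add(url)
        return True
```

In this sketch every lookup in the centralized variant goes through a single set, while the distributed variant touches only one node per lookup, which is the kind of trade-off between retrieval speed, scalability, and reliability that the abstract says the paper compares.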
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
