
Design and Implementation of a High-Performance Distributed Web Crawler

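Cite this article:	ZHANG Ling, YE Yun-ming, SONG Hui, YU Shui, MA Fan-yuan. Design and Implementation of a High-Performance Distributed Web Crawler[J]. Journal of Shanghai Jiaotong University, 2004, 38(1): 59-61.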

Authors:	ZHANG Ling, YE Yun-ming, SONG Hui, YU Shui, MA Fan-yuan

Affiliation:	Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200030

Funding:	Key Basic Research Project of the Shanghai Science and Technology Commission (02DJ14045)

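Abstract:	This paper presents the design and Java implementation of a large-scale, high-performance, distributed Web crawler. It proposes new design ideas for the crawler's data structures, functional modules, and related algorithms. It discusses the key problems that had to be solved during design and implementation, such as the distributed coordination mechanism and memory-based URL storage management, and it gives the current design and implementation approach together with a distributed lossless link analysis algorithm.
Keywords:	Web crawler; distributed system; search engine
Article ID:	1006-2467(2004)01-0059-03
Revised manuscript received:	December 26, 2002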
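
The abstract names distributed coordination and memory-based URL storage as key problems but does not describe the authors' solutions. As a rough illustration only, the sketch below shows one common way such a crawler node might be organized: each URL is assigned to a node by hashing its host name, and each node keeps an in-memory set of URLs it has already accepted. The class `UrlFrontierSketch`, its methods, and the host-hash rule are all hypothetical and are not taken from the paper.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Locale;
import java.util.Queue;
import java.util.Set;

/** Hypothetical sketch: host-hash URL partitioning plus an in-memory seen set. */
final class UrlFrontierSketch {
    private final int nodeId;     // index of this crawler node (assumed 0-based)
    private final int nodeCount;  // total number of crawler nodes (assumed fixed)
    private final Set<String> seen = new HashSet<>();      // URLs already accepted
    private final Queue<String> queue = new ArrayDeque<>(); // URLs waiting to be fetched

    UrlFrontierSketch(int nodeId, int nodeCount) {
        this.nodeId = nodeId;
        this.nodeCount = nodeCount;
    }

    /** Maps a URL to a node index by hashing its lower-cased host name. */
    static int ownerOf(String url, int nodeCount) {
        String host = URI.create(url).getHost();
        // Fall back to the whole URL when no host can be parsed.
        String key = (host == null) ? url : host.toLowerCase(Locale.ROOT);
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    /** Accepts a URL only if this node owns it and has not seen it before. */
    boolean offer(String url) {
        if (ownerOf(url, nodeCount) != nodeId) {
            return false; // in a real system this URL would be forwarded to its owner
        }
        if (!seen.add(url)) {
            return false; // duplicate
        }
        queue.add(url);
        return true;
    }

    /** Returns the next URL to fetch, or null when the queue is empty. */
    String next() {
        return queue.poll();
    }
}
```

Under these assumptions, all URLs of one host land on the same node, so duplicate detection needs no cross-node lookups. A node that extracts a link it does not own would have to send it to the owning node, which is the coordination step this sketch leaves out.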
Design and Implementation of a Distributed High-Performance Web Crawler

ZHANG Ling, YE Yun-ming, SONG Hui, YU Shui, MA Fan-yuan. Design and Implementation of a Distributed High-Performance Web Crawler[J]. Journal of Shanghai Jiaotong University, 2004, 38(1): 59-61.

Authors:	ZHANG Ling, YE Yun-ming, SONG Hui, YU Shui, MA Fan-yuan

Abstract:	The Web crawler is a core component of WWW search engines and information retrieval systems. This paper discusses the architecture of a distributed Web crawler and the design of its data structures, system modules, and related algorithms. It also examines the key problems encountered during design and implementation and presents solutions to them.

Keywords:	Web crawler; distributed system; search engine; Java
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.