
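WIDE: A Clustering Algorithm for Massive Data
Cite this article: ZHANG Qiang, ZHAO Zheng. WIDE: A Clustering Algorithm for Massive Data[J]. Journal of Tianjin University (Science and Technology), 2006, 39(7): 826-831.
Authors: ZHANG Qiang  ZHAO Zheng
Affiliation: School of Electronic Information Engineering, Tianjin University, Tianjin 300072, China
Abstract: A new clustering algorithm for massive data, WIDE (window-density clustering algorithm), is presented. It uses a grid method to localize the relationships among data points, a window technique to improve efficiency, and a density method to improve clustering accuracy. The main idea of the algorithm is to fuse the grid method and the density method, with the window acting as the intermediary. On this basis the algorithm is extended. In functionality, it supports clustering of mixed-type data, clustering of data with obstacles, and incremental clustering; in speed, it supports distributed parallel clustering. WIDE can run in parallel on multiple computers in a local area network, is efficient with a computational complexity of O(N), can discover clusters of arbitrary shape, and is insensitive to noise.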

Keywords: window  mixed-type data  clustering of data with obstacles  incremental clustering  distributed parallel clustering
Article ID: 0493-2137(2006)07-0826-06
Received: 2005-05-27
Revised: 2005-12-29

WIDE: Clustering Algorithm for Very Large Databases
ZHANG Qiang,ZHAO Zheng.WIDE: Clustering Algorithm for Very Large Databases[J].Journal of Tianjin University(Science and Technology),2006,39(7):826-831.
Authors:ZHANG Qiang  ZHAO Zheng
Institution:School of Electronic Information Engineering, Tianjin University, Tianjin 300072, China
Abstract:A new clustering algorithm for very large databases, WIDE (window-density clustering algorithm), is introduced. A grid method localizes the relationships among data, a window method provides high efficiency, and a density method provides high accuracy. The main idea of WIDE is to integrate the grid method and the density method through the window technique. On this basis, several extensions are made. Three new functions are added: it can handle data of mixed types; it can cluster in a data space containing obstacle objects; and it provides an incremental clustering method for data newly added to an already processed database. To speed up clustering, it implements distributed parallel clustering that can run on a number of workstations connected via a local area network. Its advantages are that it is very efficient, with a complexity of O(N); it is effective in discovering clusters of arbitrary shape; and it is not sensitive to noise.
Keywords:window  data of mixed types  obstacle clustering  incremental clustering  distributed parallel clustering
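
The abstract does not give the details of WIDE's window mechanism, so the following is not the authors' algorithm. It is only a minimal sketch of the general grid-plus-density idea the abstract names: points are binned into grid cells in one O(N) pass, cells holding at least `min_pts` points are treated as dense, and adjacent dense cells are merged into one cluster. The function name `grid_density_clusters` and the parameters `cell_size` and `min_pts` are hypothetical, chosen for illustration.

```python
from collections import defaultdict, deque
from itertools import product


def grid_density_clusters(points, cell_size, min_pts):
    """Return one label per point; -1 marks noise.

    Illustrative grid-density clustering, NOT the WIDE algorithm itself.
    """
    if not points:
        return []

    # Bin every point into its grid cell: a single pass over the data.
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        key = tuple(int(c // cell_size) for c in p)
        cells[key].append(idx)

    # Offsets to the neighboring cells (the 3^d block minus the cell itself).
    dim = len(points[0])
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]

    # A cell is "dense" when it holds at least min_pts points (assumed rule).
    dense = {k for k, members in cells.items() if len(members) >= min_pts}

    labels = [-1] * len(points)
    cell_label = {}
    next_label = 0
    # Breadth-first search over adjacent dense cells; sorted for determinism.
    for start in sorted(dense):
        if start in cell_label:
            continue
        cell_label[start] = next_label
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for idx in cells[cur]:
                labels[idx] = next_label
            for off in offsets:
                nb = tuple(a + b for a, b in zip(cur, off))
                if nb in dense and nb not in cell_label:
                    cell_label[nb] = next_label
                    queue.append(nb)
        next_label += 1
    return labels
```

Because each point only interacts with its own cell and each cell only with a fixed number of neighbors, the work grows linearly with the number of points for a fixed dimension, which is the kind of locality the abstract attributes to the grid method. Points in sparse cells keep the label -1, a simple stand-in for noise handling.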
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.