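A Clustering Algorithm Based on Boundary Identification
Cite this article: Zhang Xuanping, Zhu Xingchang, Ma Cong. A clustering algorithm based on boundary identification[J]. Journal of Xi'an Jiaotong University, 2007, 41(12): 1387-1390,1395
Authors: Zhang Xuanping (张选平), Zhu Xingchang (祝兴昌), Ma Cong (马琮)
Affiliation: Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
Abstract: Density-based clustering algorithms process objects in order from high-density regions to low-density regions, which leaves them unable to identify the cluster membership of low-density objects. To address this drawback, the boundaries that may arise during clustering are discussed and a clustering algorithm based on boundary identification is proposed. The idea of the algorithm is that same-cluster priority outranks density priority: when selecting the next object to cluster, objects belonging to the same cluster as those already clustered are chosen first. When expansion along one direction reaches the cluster boundary, it stops and continues in other directions; this processing order maximizes each cluster. A boundary identification criterion is established by analyzing how density changes at the cluster boundary, and the data are clustered according to this criterion. Experiments on synthetic data and on the UCI KDD data sets show that the proposed algorithm handles data in low-density regions effectively. Compared with the ordering points to identify the clustering structure (OPTICS) algorithm, clustering quality improves by about 4% while time performance is comparable.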

Keywords: clustering algorithm; density; boundary identification
Article ID: 0253-987X(2007)12-1387-04
Received: 2007-03-02
Revised: 2007-03-02

Clustering Algorithm Based on Boundary Identification
Zhang Xuanping,Zhu Xingchang,Ma Cong. Clustering Algorithm Based on Boundary Identification[J]. Journal of Xi'an Jiaotong University, 2007, 41(12): 1387-1390,1395
Authors:Zhang Xuanping  Zhu Xingchang  Ma Cong
Abstract:Density-based clustering algorithms process objects from high-density areas to low-density areas, and this ordering prevents them from identifying the cluster membership of low-density objects. To address this drawback, a clustering algorithm based on boundary identification is proposed after discussing the boundaries that may arise during the clustering process. The main idea is that same-cluster priority outranks density priority, i.e. when the next object is selected, objects belonging to the same cluster as those already clustered are processed first. When expansion in one direction reaches the cluster boundary, it stops there and turns to other directions. This processing order maximizes each cluster. After analyzing how density changes at the cluster boundary, a boundary identification rule is established and the data are clustered according to it. Experiments with synthetic data and UCI KDD data sets demonstrate that the proposed algorithm handles objects in low-density areas effectively, improving clustering quality by about 4% over the ordering points to identify the clustering structure (OPTICS) algorithm while keeping comparable time performance.
Keywords:clustering algorithm   density   boundary identification
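
The abstract does not state the boundary identification rule itself, so the following Python sketch only illustrates the processing order it describes: a started cluster is grown to completion before any new cluster begins, and expansion stops at points that look like boundary points. The function names, the `eps` radius, the neighbor-count density, and the `drop_ratio` test (a point is treated as a boundary point when its density falls below a fraction of the density of the point it was reached from) are all assumptions made for illustration, not the authors' criterion.

```python
from collections import deque
from math import dist


def neighbors(points, i, eps):
    """Indices of all points within distance eps of points[i] (excluding i)."""
    return [j for j, q in enumerate(points) if j != i and dist(points[i], q) <= eps]


def boundary_cluster(points, eps, drop_ratio=0.5):
    """Toy same-cluster-first clustering with an assumed boundary rule.

    Assumed rule (not from the paper): a point q reached from p is treated
    as a boundary point when density(q) < drop_ratio * density(p). Boundary
    points join the cluster but are not expanded further.
    """
    nbrs = [neighbors(points, i, eps) for i in range(len(points))]
    density = [len(n) for n in nbrs]  # density = neighbor count within eps
    labels = [None] * len(points)
    cluster_id = 0
    # Seeds are tried from densest to sparsest, but a started cluster is
    # always grown to completion before the next seed is considered.
    for seed in sorted(range(len(points)), key=lambda i: -density[i]):
        if labels[seed] is not None:
            continue
        labels[seed] = cluster_id
        queue = deque([seed])
        while queue:
            p = queue.popleft()
            for q in nbrs[p]:
                if labels[q] is not None:
                    continue
                labels[q] = cluster_id
                # Keep expanding only while no sharp density drop is seen.
                if density[q] >= drop_ratio * density[p]:
                    queue.append(q)
        cluster_id += 1
    return labels


dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
sparse = [(5.0, 5.0), (5.8, 5.0), (6.6, 5.0)]
print(boundary_cluster(dense + sparse, eps=1.0))  # [0, 0, 0, 0, 1, 1, 1]
```

In this toy run the sparse group is still assigned its own cluster rather than being left unlabeled. The sketch has no noise handling, so an isolated point simply becomes a singleton cluster.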
This article is indexed in CNKI, VIP (维普), Wanfang Data, and other databases.