An Improved K-means Algorithm Based on Density and Sample Size

Cite this article:	ZHAO Da-wei, XIAO Zhou-fang. An Improved K-means Algorithm Based on Density and Sample Size [J]. 科技信息 (Science & Technology Information), 2008(28).

Authors:	ZHAO Da-wei, XIAO Zhou-fang (both: School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu, P.R. China)

Affiliation:	Department of Computer Science and Technology, China University of Mining and Technology

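Abstract:	This paper studies the original K-means algorithm and improves it so that it finds a suitable value of k automatically and identifies as many outliers as possible. First, the largest plausible initial number of clusters, n, is determined for the given sample size. Next, a sample circle is constructed and divided into n equal parts; each sample is assigned to a part according to its position, and clustering is then run on these n initial classes. Finally, small classes that need merging are merged using the small-cluster merging strategy of the DBSCAN algorithm. To test clustering performance, the source code of the improved algorithm was added to "weka", the open-source platform developed by the University of Waikato, New Zealand, and compared experimentally with the original K-means algorithm on several data sets. The experiments indicate that the improved algorithm is far superior to the original K-means algorithm in both clustering quality and clustering stability.
Keywords:	data mining; clustering algorithm; K-means; DBSCAN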
Improved K-means Clustering Algorithm Based Density and Sample Size

ZHAO Da-wei, XIAO Zhou-fang. Improved K-means Clustering Algorithm Based Density and Sample Size [J]. Science & Technology Information, 2008(28).

Authors:	ZHAO Da-wei, XIAO Zhou-fang

Institution:	School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221008, Jiangsu, P.R. China (both authors)

Abstract:	The K-means algorithm is studied and improved so that it finds a suitable value of k automatically and identifies as many isolated points (outliers) as possible. First, the algorithm calculates the largest plausible initial number of clusters, n, for the given sample size. Then it constructs a sample circle, divides the circle into n equal parts, and assigns each sample to a part according to its position. Next, the improved algorithm clusters these n initial classes. Finally, small classes are merged using the small-cluster merging strategy of the DBSCAN algorithm. To test its performance, the source code of the improved algorithm was added to the open-source platform "weka", developed by the University of Waikato, New Zealand, and compared with the original K-means algorithm on several data sets. The experiments indicate that the improved algorithm is far superior to the original K-means algorithm in clustering quality and stability.

Keywords:	data mining; clustering algorithm; K-means; DBSCAN
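
The four steps summarized in the abstract (choose an initial cluster count n, split a sample circle into n equal parts, cluster from those parts, merge small classes) can be illustrated with a rough Python sketch. The abstract gives no formulas, so everything concrete here is an assumption rather than the authors' method: the points are taken to be 2-D, n is set to the integer square root of the sample size, the "sample circle" is read as a circle centered on the data mean and cut into equal angular sectors, and the DBSCAN-style merge is replaced by a much simpler rule that merges clusters whose centroids lie within a hypothetical radius `eps`. All function names are illustrative.

```python
import math
from collections import defaultdict

def sector_init(points, n):
    """Assign each 2-D point to one of n equal angular sectors around the data mean.
    (Assumed reading of the 'sample circle'; the abstract does not define it.)"""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    labels = []
    for x, y in points:
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        # min() guards against a rounding result equal to n
        labels.append(min(int(angle / (2 * math.pi) * n), n - 1))
    return labels

def centroids_of(points, labels):
    """Return the centroid of every non-empty label group, ordered by label."""
    groups = defaultdict(list)
    for p, lab in zip(points, labels):
        groups[lab].append(p)
    return [
        (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g))
        for _, g in sorted(groups.items())
    ]

def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))

def lloyd(points, centers, iters=50):
    """Standard K-means (Lloyd) iterations; empty clusters are dropped."""
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        new_centers = centroids_of(points, labels)
        if new_centers == centers:
            break
        centers = new_centers
    # Recompute labels so they always index into the final centers
    return centers, [nearest(p, centers) for p in points]

def merge_close(centers, labels, eps):
    """Merge clusters whose centroids are within eps of each other (union-find).
    This is a simplified stand-in, not DBSCAN's merging strategy."""
    parent = list(range(len(centers)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            if math.dist(centers[i], centers[j]) <= eps:
                parent[find(j)] = find(i)
    roots = sorted({find(i) for i in range(len(centers))})
    remap = {r: k for k, r in enumerate(roots)}
    return [remap[find(lab)] for lab in labels]

def cluster(points, eps):
    n = max(1, math.isqrt(len(points)))  # assumed heuristic for the initial n
    labels = sector_init(points, n)
    centers, labels = lloyd(points, centroids_of(points, labels))
    return merge_close(centers, labels, eps)

if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    print(cluster(pts, eps=3.0))
```

In this sketch the final number of clusters is whatever survives the merge step, which mirrors the abstract's claim that k is found automatically; the quality of the result depends heavily on the assumed `eps` and on the sector-based starting partition.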
This article is indexed in CNKI, VIP (维普), and other databases.