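An Improved Data Stream Clustering Method (一种改进的数据流聚类方法)
Cite this article: GENG De-zhi. An Improved Data Stream Clustering Method [J]. Journal of Shanxi Teachers University (Natural Science Edition), 2014, 0(3): 22-25
Author: GENG De-zhi
Affiliation: School of Information Technology and Engineering, Jinzhong University
Abstract (translated from the Chinese): Traditional K-means cannot effectively cluster data that changes dynamically. To address this problem, this paper proposes an improved data stream clustering technique, Streaming K-means Clustering (SKC). The method first applies K-means clustering to the initial data block already produced by the stream. When a new data block arrives, the method measures the distance between the existing clustering result and the samples in the new block, performs a preliminary, simple assignment of those samples, and computes the performance of the resulting clustering. If that performance is within an acceptable range, clustering of the data block ends; otherwise, K-means is applied to cluster the new cluster more deeply. With SKC, most data blocks in the stream receive only the simple assignment and only a few are clustered with K-means, which effectively improves the efficiency of data stream clustering. Experimental results show that Streaming K-means Clustering can effectively handle the data stream clustering problem.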

Keywords (translated from the Chinese): data stream; K-means clustering; streaming K-means clustering

An Improved Data Stream Clustering Method
GENG De-zhi. An Improved Data Stream Clustering Method[J]. Journal of Shanxi Teachers University, 2014, 0(3): 22-25
Author: GENG De-zhi
Affiliation: School of Information Technology and Engineering, Jinzhong University
Abstract: This paper presents an improved data stream clustering method, the Streaming K-means Clustering (SKC) algorithm. The initial samples of the data stream are clustered by traditional K-means. Each newly arriving block of samples is then clustered simply, by measuring the distance between the existing clusters and the new samples, and the radius of each resulting cluster is computed. If a resulting cluster does not satisfy the radius bound, it is divided into two clusters by K-means. In this way, most samples of the data stream are clustered simply and only a part of them are clustered by traditional K-means, so the efficiency of data stream clustering is improved. Simulation results on standard datasets demonstrate that the SKC method achieves high clustering efficiency and excellent clustering results on data stream clustering problems.
Keywords: data streaming; K-means clustering; streaming K-means clustering
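
The block-by-block procedure described in the abstract might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the names `kmeans`, `process_block`, and `max_radius` are hypothetical, the radius of a cluster is assumed to be the largest distance from its centroid to the new samples assigned to it, and when a cluster is split the old centroid is assumed to stand in for the earlier samples, which a stream no longer holds.

```python
import math
import random

def mean(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def nearest(p, centroids):
    """Index of the centroid closest to point p (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style K-means; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct positions as the start
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # An empty group keeps its previous centroid.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids

def process_block(centroids, block, max_radius):
    """Simple assignment of a new block, with a K-means split as fallback.

    Assumption: a cluster is acceptable when every new sample assigned to
    it lies within max_radius of its centroid. Otherwise the new samples,
    plus the old centroid as a stand-in for earlier data, are divided
    into two clusters by K-means.
    """
    groups = [[] for _ in centroids]
    for p in block:
        groups[nearest(p, centroids)].append(p)
    updated = []
    for c, g in zip(centroids, groups):
        radius = max((math.dist(p, c) for p in g), default=0.0)
        if radius <= max_radius:
            updated.append(c)                   # cheap path: keep the cluster
        else:
            updated.extend(kmeans(g + [c], 2))  # costly path: split in two
    return updated

# Usage: cluster an initial block once, then process later blocks cheaply.
initial_block = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
                 (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids = kmeans(initial_block, 2)
centroids = process_block(centroids, [(0.5, 0.5), (10.5, 10.5)], max_radius=2.0)
```

Only blocks that violate the radius bound trigger the K-means call, which is the source of the efficiency gain the abstract claims. How the bound is chosen, and whether centroids are updated after a simple assignment, are not stated in the abstract and are left out of this sketch.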
This article is indexed in CNKI, 维普 (VIP), and other databases.