An improved K-means algorithm by weighted distance based on maximum between-cluster variation



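Cite this article:	ZHANG Xue-feng, LIU Peng. An improved K-means algorithm by weighted distance based on maximum between-cluster variation[J]. Journal of Shandong University (Natural Science), 2010, 45(7): 28-33.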

Authors:	ZHANG Xue-feng, LIU Peng

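Institution:	1. School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai 200433; 2. School of Continuing Education, Shanghai University of Finance and Economics, Shanghai 200080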

Funding:	Key Discipline Construction Project, Phase III of the "211 Project", Shanghai University of Finance and Economics

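Abstract:	To improve the clustering quality of the K-means algorithm, the clustering criterion function is defined as a weighted within-cluster sum of the squared error (SSE), and the way data objects are reassigned during the K-means iteration is adjusted: a weighted distance that incorporates the number of data objects in each cluster serves as the basis for reassignment, and the parameter of this weighted distance is optimized under the criterion of maximizing between-cluster variation. Experiments show that the improved K-means algorithm greatly reduces the occurrence of large clusters being split and clearly improves clustering quality.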
Keywords:	K-means algorithm; clustering; between-cluster variation; weighted distance
Received:	2010-04-02
An improved K-means algorithm by weighted distance based on maximum between-cluster variation

ZHANG Xue-feng, LIU Peng. An improved K-means algorithm by weighted distance based on maximum between-cluster variation[J]. Journal of Shandong University, 2010, 45(7): 28-33.

Authors:	ZHANG Xue-feng, LIU Peng

Institution:	1. School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai 200433, China; 2. School of Continuing Education, Shanghai University of Finance and Economics, Shanghai 200080, China

Abstract:	To find natural clusters, the criterion function was redefined as a weighted sum of the squared error. The way each point is assigned to a centroid during the K-means iteration was also modified: each point is assigned to the centroid with the minimum weighted distance. The weight depends on the number of points in each cluster, and the parameter of the weighted distance is optimized by maximizing the between-cluster variation. Experimental results show that the improved K-means algorithm significantly improves clustering quality by reducing the probability that a larger cluster is split.

Keywords:	K-means algorithm; clustering; between-cluster variation; weighted distance
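
The modified reassignment step described in the abstract can be sketched roughly as follows. The abstract does not state the weighted-distance formula, so the form used here, the Euclidean distance divided by the cluster size raised to a parameter `alpha`, is purely an assumption for illustration, as are the function name `weighted_kmeans` and its arguments. The paper's optimization of the parameter by maximizing between-cluster variation is not reproduced; `alpha` is simply passed in.

```python
import math
import random


def weighted_kmeans(points, k, alpha=0.5, iters=20, seed=0):
    """K-means variant whose assignment step uses a size-weighted distance.

    ASSUMPTION: the weighted distance is dist(x, c_j) / n_j**alpha, where
    n_j is the current size of cluster j. This form is illustrative only;
    it is not taken from the paper. alpha = 0 gives standard K-means.
    """
    points = [tuple(p) for p in points]
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct data points as seeds
    sizes = [1] * k                    # neutral weights for the first pass
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment: choose the centroid with the smallest weighted distance.
        # Larger clusters shrink the distance, so they attract more points.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda j: math.dist(p, centroids[j]) / sizes[j] ** alpha,
            )
        # Update: recompute each centroid and cluster size.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
            sizes[j] = max(len(members), 1)
    return labels, centroids


# Example call on two well-separated 1-D groups (hypothetical data).
pts = [(0.0,), (1.0,), (2.0,), (100.0,), (101.0,), (102.0,)]
labels, centroids = weighted_kmeans(pts, k=2, alpha=0.0)
```

Under this assumed form, a larger `alpha` makes big clusters pull in nearby points more strongly, which is one plausible way to discourage a large cluster from being split; whether it matches the paper's weighting would need to be checked against the full text.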
This article is indexed in CNKI, Wanfang Data, and other databases.