
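MapReduce Parallel Implementation of the k-means Clustering Algorithm
Cite this article: Jiang Xiaoping, Li Chenghua, Xiang Wen, Zhang Xinfang, Yan Haitao. MapReduce parallel implementation of the k-means clustering algorithm[J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2011, 39(Z1): 120-124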
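Authors: Jiang Xiaoping  Li Chenghua  Xiang Wen  Zhang Xinfang  Yan Haitao
Affiliations: 1. College of Electronics and Information Engineering, South-Central University for Nationalities, Wuhan, Hubei 430074
2. School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074
3. Business Support Center, China Mobile Group Hubei Co., Ltd., Wuhan, Hubei 430040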
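Funding: Fundamental Research Funds for the Central Universities (CZY11002); Wuhan Science and Technology Key Project (201110821229); Innovation Fund of the TD-SCDMA Joint Innovation Laboratory of Huazhong University of Science and Technology and Hubei Mobile Communications Company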
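Abstract: Based on the characteristics of the k-means clustering algorithm, a method for implementing it with the MapReduce programming model is presented. The Map function computes the distance from each record to the cluster centers and relabels the record with the new cluster it belongs to. The Reduce function uses the intermediate results produced by the Map function to compute the new cluster centers, which are passed to the next round of MapReduce Job. Experimental results show that the MapReduce-parallelized k-means algorithm, deployed on a Hadoop cluster, achieves good speedup and good scalability.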

Keywords: cloud computing  parallel computing  MapReduce model  data mining  k-means clustering algorithm
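
The Map and Reduce steps summarized in the abstract can be illustrated with a small in-memory sketch. This is not the authors' Hadoop code: it only simulates one job per iteration in plain Python, and it assumes Euclidean distance, a convergence tolerance, and that an empty cluster keeps its previous center. The names `kmeans_map`, `kmeans_reduce`, `run_job`, and `kmeans` are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def kmeans_map(point, centers):
    """Map step: emit (id of the nearest center, point)."""
    nearest = min(range(len(centers)), key=lambda i: dist(point, centers[i]))
    return nearest, point


def kmeans_reduce(cluster_id, points):
    """Reduce step: average all points that share one cluster id."""
    n = len(points)
    centroid = tuple(sum(coords) / n for coords in zip(*points))
    return cluster_id, centroid


def run_job(points, centers):
    """One simulated MapReduce job: map, group by key (shuffle), reduce."""
    groups = defaultdict(list)
    for p in points:
        cid, value = kmeans_map(p, centers)
        groups[cid].append(value)
    new_centers = list(centers)  # assumption: empty cluster keeps its old center
    for cid, members in groups.items():
        _, new_centers[cid] = kmeans_reduce(cid, members)
    return new_centers


def kmeans(points, centers, max_jobs=20, tol=1e-9):
    """Chain jobs until no center moves by more than tol (assumed stop rule)."""
    for _ in range(max_jobs):
        new_centers = run_job(points, centers)
        if all(dist(a, b) <= tol for a, b in zip(centers, new_centers)):
            return new_centers
        centers = new_centers
    return centers


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    print(kmeans(data, [(0.0, 0.0), (10.0, 10.0)]))
    # prints [(0.0, 1.0), (10.0, 11.0)]
```

In a real cluster the grouping inside `run_job` would be done by the framework's shuffle phase, and the new centers would have to be made available to every mapper before the next job starts.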

Parallel implementing k-means clustering algorithm using MapReduce programming mode
Jiang Xiaoping, Li Chenghua, Xiang Wen, Zhang Xinfang, Yan Haitao. Parallel implementing k-means clustering algorithm using MapReduce programming mode[J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2011, 39(Z1): 120-124
Authors: Jiang Xiaoping (1)  Li Chenghua (1)  Xiang Wen (2)  Zhang Xinfang (2)  Yan Haitao (3)
Affiliation: 1. College of Electronics and Information Engineering, South-Central University for Nationalities, Wuhan 430074, China
2. School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
3. Business Support Center, China Mobile Group Hubei Co. Ltd., Wuhan 430040, China
Abstract: This paper studies how to implement the k-means clustering algorithm with the MapReduce programming model. In the Map function, the distance between each point and each cluster center is calculated, and each point is assigned the ID of its new cluster. All points with the same key (the current cluster ID) are sent to a single reducer, which computes the new cluster centroid for the next MapReduce Job. Experiments on the Hadoop platform show basically linear speedup as the number of nodes increases, as well as good scalability.
Keywords: cloud computing  parallel computing  MapReduce programming model  data mining  k-means clustering algorithm
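
For readers unfamiliar with the term, "basically linear speedup" can be read against the usual definition of speedup. The abstract does not state the exact metric used in the paper, so the formula below is an assumption based on the standard definition, with T_1 the running time on one node and T_p the running time on p nodes for the same dataset.

```latex
S(p) = \frac{T_1}{T_p}, \qquad \text{linear speedup: } S(p) \approx p
```

Under this reading, doubling the number of nodes would roughly halve the running time of each k-means job.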
This article is indexed in CNKI, Wanfang Data, and other databases.