
Parallel clustering algorithm based on sparse index sort of high dimensional data

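Cite this article:	WU Sen, FENG Xiao-dong, WU Qing-hai. Parallel clustering algorithm based on sparse index sort of high dimensional data[J]. Systems Engineering - Theory & Practice, 2011, 0(Z2): 13-18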

Authors:	WU Sen, FENG Xiao-dong, WU Qing-hai

Affiliation:	School of Economics and Management, University of Science and Technology Beijing

Funding:	National Natural Science Foundation of China (70771007); Fundamental Research Funds for the Central Universities (FRF-TP-10-006B)

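Abstract:	High-dimensional data clustering is an important research topic in data mining, and clustering large-scale high-dimensional data is particularly challenging. Building on CABOSFV, an efficient high-dimensional data clustering algorithm, this paper adopts a parallel computing model to improve its capacity for large-scale data and proposes P-CABOSFV, a parallel clustering algorithm for high-dimensional data based on sparse index sorting. The algorithm sorts the high-dimensional data by sparse index and selects split points to partition the data, assigns the partitions to multiple computing nodes that carry out the clustering task simultaneously, and then merges the nodes' clustering results into the final result using a merging strategy based on the set sparse feature difference. Experiments on UCI data sets and synthetic data sets show that P-CABOSFV produces good clustering quality, scales well with both data size and data dimensionality, and is effective and feasible.
Keywords:	sparse index; data partition; high dimensional data clustering; parallel computing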
Parallel clustering algorithm based on sparse index sort of high dimensional data

WU Sen, FENG Xiao-dong, WU Qing-hai. Parallel clustering algorithm based on sparse index sort of high dimensional data[J]. Systems Engineering - Theory & Practice, 2011, 0(Z2): 13-18

Authors:	WU Sen, FENG Xiao-dong, WU Qing-hai

Affiliation:	School of Economics and Management, University of Science and Technology Beijing, Beijing 100083, China

Abstract:	High dimensional data clustering is an important research subject of data mining. Large scale high dimensional data clustering is more challenging. A parallel algorithm, P-CABOSFV, is presented based on sparse index sort by extending CABOSFV, an efficient high dimensional data clustering algorithm, and by using a parallel computing pattern to improve the capability to deal with large scale data. The proposed algorithm partitions data according to high dimensional data sparse index sort, distributes the segmented data...

Keywords:	sparse index; data partition; high dimensional data clustering; parallel computing
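
The partition, cluster, and merge pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the sparse index or the set sparse feature difference, so the sketch assumes binary data, takes the sparse index of an object to be its count of 1-valued attributes, and uses the difference measure e / (|X| * a) as it is usually described for CABOSFV (a = attributes equal to 1 for every object in the set, e = attributes on which the objects disagree). Split points are simply equal-sized cuts of the sorted order, a thread pool stands in for the computing nodes, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def sparse_index(row):
    # Assumption: sparse index = number of attributes equal to 1.
    return sum(row)


def partition_by_sparse_index(rows, n_parts):
    # Sort object indices by sparse index, then cut into contiguous chunks.
    order = sorted(range(len(rows)), key=lambda i: sparse_index(rows[i]))
    size = max(1, -(-len(order) // n_parts))  # ceiling division
    return [order[i:i + size] for i in range(0, len(order), size)]


def sfd(members, rows):
    # Assumed set sparse feature difference: e / (|X| * a).
    cols = [[rows[i][j] for i in members] for j in range(len(rows[0]))]
    a = sum(1 for c in cols if all(v == 1 for v in c))  # all objects are 1
    e = sum(1 for c in cols if len(set(c)) > 1)         # objects disagree
    if a == 0:
        return float("inf") if e else 0.0
    return e / (len(members) * a)


def absorb(clusters, group, rows, threshold):
    # Add `group` to the cluster giving the smallest difference within the
    # threshold; otherwise start a new cluster.
    best, best_d = None, threshold
    for c in clusters:
        d = sfd(c + group, rows)
        if d <= best_d:
            best, best_d = c, d
    if best is None:
        clusters.append(list(group))
    else:
        best.extend(group)


def cluster_chunk(indices, rows, threshold):
    # One-pass clustering of a single partition (one "computing node").
    clusters = []
    for i in indices:
        absorb(clusters, [i], rows, threshold)
    return clusters


def p_cabosfv_sketch(rows, n_parts, threshold):
    parts = partition_by_sparse_index(rows, n_parts)
    # Threads stand in for the separate computing nodes.
    with ThreadPoolExecutor(max_workers=max(1, len(parts))) as pool:
        results = list(pool.map(lambda p: cluster_chunk(p, rows, threshold), parts))
    # Merge the per-node clusters using the same difference measure.
    merged = []
    for clusters in results:
        for c in clusters:
            absorb(merged, c, rows, threshold)
    return merged
```

Calling `p_cabosfv_sketch(rows, n_parts=2, threshold=0.5)` on a list of 0/1 rows returns clusters as lists of row indices. Reusing one threshold for both the per-node clustering and the merge is a simplification made here for brevity; the paper may use a different merge rule.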
This article is indexed in CNKI and other databases.
