
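A parallel C-Cubing algorithm based on MapReduce
Cite this article: Jian-Qing XI, Jin-Guo YOU, De-You TANG. A parallel C-Cubing algorithm based on MapReduce[J]. Journal of South China University of Technology (Natural Science Edition), 2009, 37(1).
Authors: Jian-Qing XI, Jin-Guo YOU, De-You TANG
Affiliation: School of Computer Science and Engineering, South China University of Technology; South China University of Technology
Abstract (translated from the Chinese): The main task in computing a closed cube is to decide, as each data cell is generated, whether that cell is closed. C-Cubing is a recently proposed and effective method for this problem: unlike earlier output-based or tuple-based methods, it identifies closed cells using only a dedicated measure, the closeness measure. However, the performance of C-Cubing degrades as the data volume grows, so a parallel version of the algorithm remains to be studied. This paper proposes a method that computes the closed cube in parallel with C-Cubing on top of the MapReduce parallel framework, and implements it on Hadoop. Experimental results show that the scheme can use inexpensive PCs to effectively improve the performance of closed cube computation on larger data sets.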

Keywords: online analytical processing (OLAP); parallel computing; closed cube; MapReduce; Hadoop
Received: 2008-4-7
Revised: 2008-4-23

A parallel C-Cubing algorithm based on MapReduce
Jian-Qing XI, Jin-Guo YOU, De-You TANG. A parallel C-Cubing algorithm based on MapReduce[J]. Journal of South China University of Technology (Natural Science Edition), 2009, 37(1).
Authors: Jian-Qing XI, Jin-Guo YOU, De-You TANG
Abstract: The main task in closed cube computation is to check whether each newly generated data cell is closed. C-Cubing is a recently proposed, efficient approach to this problem. Unlike earlier output-based and tuple-based methods, it checks closeness using only a dedicated closeness measure. However, the performance of C-Cubing degrades as data sets grow, so a parallel C-Cubing algorithm needs to be studied. This paper presents a solution in which C-Cubing is parallelized on the MapReduce framework, together with an implementation on Hadoop. The experiments show that, using low-cost PCs, the solution effectively improves the performance of closed cube computation on large data sets.
Keywords: OLAP; parallel computation; closed cube; MapReduce; Hadoop
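
The idea of deciding closedness from an aggregated measure can be illustrated with a small sketch. The code below is not the authors' algorithm and does not use Hadoop: it is a naive, single-process simulation of a map step and a reduce step, assuming a count measure and a simplified per-cell summary that records, for each dimension, either the one value shared by all covered tuples or a "mixed" marker. All names (`map_tuple`, `merge`, `reduce_cell`, `closed_cube`, `STAR`, `MIXED`) are hypothetical and chosen for illustration only. A cell is reported as closed when every aggregated (`*`) dimension is mixed, since otherwise a more specific cell covers exactly the same tuples.

```python
from collections import defaultdict
from itertools import product

STAR = "*"        # placeholder for an aggregated dimension
MIXED = object()  # marker: covered tuples disagree on this dimension


def map_tuple(row):
    """Map step: emit (cell, row) for each of the 2^d cells covering this tuple."""
    for mask in product((False, True), repeat=len(row)):
        cell = tuple(v if keep else STAR for v, keep in zip(row, mask))
        yield cell, row


def merge(state, row):
    """Fold one tuple into a cell summary of (count, shared values per dimension)."""
    if state is None:
        return 1, tuple(row)
    count, shared = state
    # Keep a value only while every tuple seen so far agrees on it.
    shared = tuple(s if s == v else MIXED for s, v in zip(shared, row))
    return count + 1, shared


def reduce_cell(cell, rows):
    """Reduce step: aggregate the count and decide closedness from the summary."""
    state = None
    for row in rows:
        state = merge(state, row)
    count, shared = state
    # Closed iff no aggregated dimension has a single value shared by all tuples.
    closed = all(s is MIXED for c, s in zip(cell, shared) if c == STAR)
    return count, closed


def closed_cube(rows):
    """Simulate shuffle + reduce in memory; return {closed cell: count}."""
    groups = defaultdict(list)
    for row in rows:
        for cell, r in map_tuple(row):
            groups[cell].append(r)
    result = {}
    for cell, covered in groups.items():
        count, closed = reduce_cell(cell, covered)
        if closed:
            result[cell] = count
    return result


if __name__ == "__main__":
    data = [("a1", "b1", "c1"), ("a1", "b2", "c1")]
    for cell, count in sorted(closed_cube(data).items()):
        print(cell, count)
```

Enumerating all 2^d cells per tuple is exponential in the number of dimensions, so this sketch only suits tiny inputs; it is meant to show why an aggregatable summary is enough to test closedness without comparing against previously output cells or rescanning tuples.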