首页 | 本学科首页   官方微博 | 高级检索  
     检索      

分布式全局频繁项目集的快速挖掘方法
引用本文:宋宝莉,覃征.分布式全局频繁项目集的快速挖掘方法[J].西安交通大学学报,2006,40(8):923-927.
作者姓名:宋宝莉  覃征
作者单位:1. 西安交通大学计算机科学与技术系,710049,西安;深圳市劳动保障局,518029,深圳
2. 西安交通大学计算机科学与技术系,710049,西安
摘    要:针对传统的分布式全局频繁项目集挖掘算法存在大量的候选项目集,且求全局频繁项目集的网络通信代价过高等问题,提出了一种分布式数据库的全局频繁项目集快速挖掘算法(FDMA).该算法改进了频繁模式树(FP-树)的结构,将双向FP-树改为单向,每个节点只保留指向父结点的指针,减少了指针数,由此可节省1/3的树空间;同时通过传送用3个很小的数组表示的被约束子树,在此挖掘全局频繁项目集的过程中不再生成大量候选项目集或条件FP-树,从而减小了网络通信量,提高了挖掘效率.实验表明,所提算法的挖掘速度比传统的分布式数据库数据挖掘算法至少提高了1倍之多,随着数据库规模的增大,它的扩展性将更好.

关 键 词:数据挖掘  分布式数据库  全局频繁项目集  被约束子树
文章编号:0253-987X(2006)08-0923-05
收稿时间:2006-02-17
修稿时间:2006年2月17日

Fast Mining Algorithm for Distributed Global Frequent Itemsets
Song Baoli,Qin Zheng.Fast Mining Algorithm for Distributed Global Frequent Itemsets[J].Journal of Xi'an Jiaotong University,2006,40(8):923-927.
Authors:Song Baoli  Qin Zheng
Abstract:Aiming at the problem that there exists prolific candidate sets and the communication cost is high for obtaining global frequent itemsets in distributed mining algorithms,an algorithm FDMA(fast distributed mining algorithm) for mining global frequent itemsets in distributed database is proposed.It improves the structure of frequent pattern tree(FP-tree) and the bidirectional FP-tree is transformed into unilateral one and only the pointer that points to the parent node is reserved at each node.The number of the pointer is decreased and one third of tree space can be saved.Meanwhile,by transmitting constrained sub-trees(consisting of three small arrays),the proposed algorithm doesn't generate conditional FP-tree or a large number of candidate sets in mining process,and thus the network traffic is reduced and the mining efficiency is increased.Experiments show that in comparison with DDDM(distributed dual decision miner) algorithm,the proposed algorithm increases the mining speed by at least two times.Moreover,the algorithm has better extensibility with increasing the scale of database.
Keywords:data mining  distributed database  global frequent itemset  constrained sub-tree
本文献已被 CNKI 维普 万方数据 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号