
Mining Frequent Closed Itemsets in Large High Dimensional Data
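Authors: YU Guang-zhu (余光柱), ZENG Xian-hui (曾宪辉), SHAO Shi-huang (邵世煌)
Affiliations: College of Information Sciences and Technology, Donghua University; Department of Computer Science, College of Hubei Police Officer
Funding: Specialized Research Fund for the Doctoral Program of Higher Education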
Abstract: Large high-dimensional data pose great challenges to existing algorithms for frequent itemset mining. To address this problem, a hybrid method is proposed that combines a novel row enumeration algorithm with a column enumeration algorithm. The hybrid method decomposes the mining task into two subtasks and chooses an appropriate algorithm for each. The novel algorithm, Inter-transaction, exploits the fact that long transactions share few common items. In addition, an optimization technique is adopted to speed up the intersection of bit-vectors. Experiments on synthetic data show that the method achieves high performance on large high-dimensional data.
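
The abstract names two ingredients, intersecting transactions (row enumeration) and intersecting bit-vectors, without giving details. The Python sketch below illustrates both ideas on a toy scale. It is not the authors' Inter-transaction algorithm: the function names, the use of Python integers as bit-vectors, and the exhaustive incremental intersection are all assumptions made for illustration. It relies only on the standard fact that every closed itemset is the intersection of one or more transactions.

```python
def item_bitvectors(transactions):
    """Map each item to an int whose bit i is set when transaction i contains it.

    Hypothetical helper: Python ints stand in for the paper's bit-vectors.
    """
    vectors = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def support(itemset, vectors, n_transactions):
    """Count the transactions containing every item by AND-ing bit-vectors."""
    mask = (1 << n_transactions) - 1  # start with all transactions selected
    for item in itemset:
        mask &= vectors.get(item, 0)
    return bin(mask).count("1")


def closed_itemsets_by_intersection(transactions, min_support):
    """Row-enumeration sketch (an assumption, not the paper's procedure).

    Builds every non-empty intersection of one or more transactions, then
    keeps those whose support reaches min_support.
    """
    rows = [frozenset(t) for t in transactions]
    candidates = set()
    for row in rows:
        # Intersect the new row with every intersection found so far.
        new = {row} if row else set()
        for candidate in candidates:
            common = candidate & row
            if common:
                new.add(common)
        candidates |= new

    vectors = item_bitvectors(rows)
    result = {}
    for candidate in candidates:
        count = support(candidate, vectors, len(rows))
        if count >= min_support:
            result[candidate] = count
    return result


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"}]
    for itemset, count in closed_itemsets_by_intersection(data, 2).items():
        print(sorted(itemset), count)
```

When long transactions share few items, as the abstract assumes, most intersections are small or empty, so the candidate set stays compact. In the worst case this naive sketch can still generate exponentially many candidates.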

Keywords: frequent closed itemsets; large high-dimensional data; hybrid method; computer program

YU Guang-zhu, ZENG Xian-hui, SHAO Shi-huang. Mining Frequent Closed Itemsets in Large High Dimensional Data [J]. Journal of Donghua University, 2008, 25(4): 416-424.
Authors:YU Guang-zhu  ZENG Xian-hui  SHAO Shi-huang
Institution:
1. College of Information Sciences and Technology, Donghua University, Shanghai 201620, China; Department of Computer Science, College of Hubei Police Officer, Wuhan 430034, China
2. College of Information Sciences and Technology, Donghua University, Shanghai 201620, China
Keywords: frequent closed itemsets; large high-dimensional data; row enumeration; column enumeration; hybrid method