首页 | 本学科首页   官方微博 | 高级检索  
     

Constructing Projection Frequent Pattern Tree for Efficient Mining
引用本文:Xiang Jian-wen,He Yan-xiang,Kokichi Futatsugi,Kong Wei-qiangSchool of Computer,Wuhan University,Wuhan 430072,Hubei,China,State Key Lab of Software Engineering,Wuhan University,Wuhan 430072,Hubei,China,School of Information Science,Japan Advanced Institute of Science and Technology,1 - 1 Asahidai,Tatsunokuchi,Ishikawa,923-1292,Japan. Constructing Projection Frequent Pattern Tree for Efficient Mining[J]. 武汉大学学报:自然科学英文版, 2003, 8(2): 351-357. DOI: 10.1007/BF02907210
作者姓名:Xiang Jian-wen  He Yan-xiang  Kokichi Futatsugi  Kong Wei-qiangSchool of Computer  Wuhan University  Wuhan 430072  Hubei  China  State Key Lab of Software Engineering  Wuhan University  Wuhan 430072  Hubei  China  School of Information Science  Japan Advanced Institute of Science and Technology  1 - 1 Asahidai  Tatsunokuchi  Ishikawa  923-1292  Japan
作者单位:Xiang Jian-wen,He Yan-xiang,Kokichi Futatsugi,Kong Wei-qiangSchool of Computer,Wuhan University,Wuhan 430072,Hubei,China;State Key Lab of Software Engineering,Wuhan University,Wuhan 430072,Hubei,China;School of Information Science,Japan Advanced Institute of Science and Technology,1 - 1 Asahidai,Tatsunokuchi,Ishikawa,923-1292,Japan
基金项目:the National Natural Science Foundation of China(90104005)
摘    要:Frequent Pattern mining plays an essential role in data mining. Most of the previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there exist prolific patterns and/or long patterns.In this study, we introduce a novel frequent pattern growth (FP-growth) method, which is efficient and scalable for mining both long and short frequent patterns without candidate generation. And build a new projection frequent pattern tree (PFP-tree) algorithm on this study, which not only heirs all the advantages in the FP-growth method, but also avoids it's bottleneck in database size dependence when constructing the frequent pattern tree (FP-tree). Efficiency of mining is achieved by introducing the projection technique, which avoid serial scan each frequent item in the database, the cost is mainly related to the depth of the tree, namely the number of frequent items of the longest transaction in the database, not the sum of all


Constructing projection frequent pattern tree for efficient mining
Xiang Jian-wen,He Yan-xiang,Kokichi Futatsugi,Kong Wei-qiang. Constructing projection frequent pattern tree for efficient mining[J]. Wuhan University Journal of Natural Sciences, 2003, 8(2): 351-357. DOI: 10.1007/BF02907210
Authors:Xiang Jian-wen  He Yan-xiang  Kokichi Futatsugi  Kong Wei-qiang
Affiliation:(1) School of Computer, Wuhan University, 430072 Wuhan, Hubei, China;(2) State Key Lab of Software Engineering, Wuhan University, 430072 Wuhan, Hubei, China;(3) School of Information Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, 923-1292 Tatsunokuchi, Ishikawa, Japan
Abstract:Frequent Pattern mining plays an essential role in data mining. Most of the previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there exist prolific patterns and/or long patterns.In this study, we introduce a novel frequent pattern growth (FP-growth) method, which is efficient and scalable for mining both long and short frequent patterns without candidate generation. And build a new projection frequent pattern tree (PFP-tree) algorithm on this study, which not only heirs all the advantages in the FP-growth method, but also avoids it's bottleneck in database size dependence when constructing the frequent pattern tree (FP-tree). Efficiency of mining is achieved by introducing the projection technique, which avoid serial scan each frequent item in the database, the cost is mainly related to the depth of the tree, namely the number of frequent items of the longest transaction in the database, not the sum of all the frequent items in the database, which hugely shortens the time of tree-construction. Our performance study shows that the PFP-tree method is efficient and scalable for mining large databases or data warehouses, and is even about an order of magnitude faster than the FP-growth method.
Keywords:data mining  frequent patterns tree  frequent patterns growth  projection frequent pattern tree
本文献已被 CNKI 万方数据 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号