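NPSP: An Efficient Incremental Mining Algorithm for Sequential Patterns
Cite this article: ZHANG Bing, NIE Yong-hong, LIN Shi-min. NPSP: An Efficient Incremental Mining Algorithm for Sequential Patterns[J]. Journal of Guangxi Normal University (Natural Science Edition), 2004, 22(4): 22-26
Authors: ZHANG Bing  NIE Yong-hong  LIN Shi-min
Affiliations: Department of Modern Science and Technology, Jiangsu Administration Institute, Nanjing, Jiangsu 210004; Department of Computer Engineering, Guangxi University of Technology, Liuzhou, Guangxi 545006; College of Mathematics and Computer Science, Guangxi Normal University, Guilin, Guangxi 541004
Funding: Supported by an Australian ARC grant (DP0343109)
Abstract: A data structure called the "Heterogeneity Tree" is proposed, together with a set of numbering rules for its branches. Under these rules, branches with the same number represent the same candidate sequence and branches with different numbers represent different candidate sequences, which greatly simplifies the counting of candidate sets. On this basis, NPSP, an efficient sequential pattern mining algorithm with incremental mining capability, is proposed. Theoretical analysis and experiments show that its mined result set is complete and that the algorithm is efficient.

Keywords: data mining  sequential patterns  NPSP algorithm  incremental mining
Article ID: 1001-6600(2004)04-0022-05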

NPSP: AN EFFICIENT ALGORITHM WITH INCREMENTAL DATA MINING FOR MINING SEQUENTIAL PATTERNS
ZHANG Bing, NIE Yong-hong, LIN Shi-min. NPSP: AN EFFICIENT ALGORITHM WITH INCREMENTAL DATA MINING FOR MINING SEQUENTIAL PATTERNS[J]. Journal of Guangxi Normal University (Natural Science Edition), 2004, 22(4): 22-26
Authors: ZHANG Bing  NIE Yong-hong  LIN Shi-min
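Affiliation: 1. Department of Modern Science and Technology, Jiangsu Administration Institute, Nanjing, Jiangsu 210004; 2. Department of Computer Engineering, Guangxi University of Technology, Liuzhou, Guangxi 545006; 3. College of Mathematics and Computer Science, Guangxi Normal University, Guilin, Guangxi 541004 (ZHANG Bing 1, NIE Yong-hong 2, LIN Shi-min 3)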
Abstract: GSP and PSP are the two main algorithms for mining sequential patterns, but neither supports incremental mining and both are relatively inefficient. This paper presents a data structure called the Heterogeneity Tree and a set of rules for numbering its branches. The rules ensure that branches with the same serial number represent the same candidate and branches with different serial numbers represent different candidates, which simplifies counting the support of candidates. On this basis, an efficient sequential pattern mining algorithm with incremental mining capability, NPSP, is obtained. Finally, theoretical analysis and experiments show that the mined result set is complete and that NPSP is efficient.
Keywords: data mining  sequential patterns  NPSP algorithm  incremental data mining
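
The counting idea in the abstract (equal candidates share one identifier, so each needs only one counter) can be illustrated with a minimal sketch. This is not the NPSP algorithm or the Heterogeneity Tree, whose details the abstract does not give. It is a generic support counter under stated assumptions: sequences are tuples of single items, and `SupportCounter`, `add_sequences`, and `frequent` are hypothetical names chosen for illustration. The incremental step only updates counts for a fixed candidate set; it does not generate new candidates.

```python
from collections import Counter

def is_subsequence(candidate, sequence):
    """Return True if the items of candidate occur in sequence in the same order."""
    it = iter(sequence)
    # `item in it` advances the iterator, so matches must appear in order.
    return all(item in it for item in candidate)

class SupportCounter:
    """Illustrative only: one identifier and one counter per distinct candidate."""

    def __init__(self, candidates):
        self.ids = {}  # candidate tuple -> identifier (assumed scheme, not the paper's)
        for cand in candidates:
            # Equal candidates map to the same identifier.
            self.ids.setdefault(tuple(cand), len(self.ids))
        self.counts = Counter()  # identifier -> number of supporting sequences
        self.total = 0           # number of data sequences seen so far

    def add_sequences(self, sequences):
        """Update counts with a new batch without rescanning earlier batches."""
        for seq in sequences:
            self.total += 1
            for cand, cid in self.ids.items():
                if is_subsequence(cand, seq):
                    self.counts[cid] += 1

    def frequent(self, min_support):
        """Candidates whose support ratio is at least min_support."""
        threshold = min_support * self.total
        return sorted(c for c, cid in self.ids.items() if self.counts[cid] >= threshold)

# The duplicate ("a", "b") shares an identifier with the first candidate.
counter = SupportCounter([("a", "b"), ("b", "c"), ("a", "b")])
counter.add_sequences([("a", "b", "c"), ("a", "c", "b")])  # initial database
counter.add_sequences([("b", "c")])                        # later increment
result = counter.frequent(0.6)
```

Keeping the counters between calls is what lets the second batch be processed on its own; a real incremental miner would also have to handle candidates that become frequent only after the update.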
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.