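An Efficient Incremental Update Algorithm for Sequential Patterns in Large Databases
Cite this article: Zou Xiang, Zhang Wei, Cai Qingsheng, Wang Qingyi. An Efficient Incremental Update Algorithm for Sequential Patterns in Large Databases [J]. Journal of Nanjing University (Natural Sciences), 2003, 39(2): 165-171.
Authors: Zou Xiang  Zhang Wei  Cai Qingsheng  Wang Qingyi
Affiliation: Department of Computer Science, University of Science and Technology of China, Hefei 230027
Funding: National Natural Science Foundation of China (70171052, 60075015)
Abstract: This paper proposes FIMS (fast incremental mining of sequential patterns), an incremental update algorithm for sequential patterns that handles the maintenance of sequential patterns when the database is updated. The main idea is to reuse the results of the earlier sequential pattern mining and to build a projected database, which reduces both the number of scans over the whole database and the generation of candidate sequences, thereby improving mining efficiency. Experimental results show that when the amount of updated data is much smaller than the whole database, FIMS outperforms the GSP algorithm by a factor of 4 to 7.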

Keywords: database  incremental update algorithm  data mining  sequential pattern  number of scans  candidate sequence

An Efficient Incremental Updating Algorithm for Discovering Sequential Patterns in Large Database
Zou Xiang, Zhang Wei, Cai Qingsheng, Wang Qingyi. An Efficient Incremental Updating Algorithm for Discovering Sequential Patterns in Large Database [J]. Journal of Nanjing University: Nat Sci Ed, 2003, 39(2): 165-171.
Authors: Zou Xiang  Zhang Wei  Cai Qingsheng  Wang Qingyi
Abstract: An incremental updating technique for discovering sequential patterns, called FIMS (fast incremental mining of sequential patterns), is proposed to maintain previously discovered sequential patterns when the database is updated. The main idea is to use the results of an earlier mining run to cut the cost of finding new sequential patterns in the updated database. First, the whole database, which consists of the original database and the incremental database, is scanned twice and a projected database is constructed from it. Then the projected database is mined to obtain all new candidate sequential patterns. Finally, the whole database is scanned once more to obtain all new sequential patterns. Because FIMS scans the whole database only three times in total and the projected database is much smaller than the whole database, both the database scanning and the growth of candidate sequences are greatly reduced, which improves mining efficiency. Experiments show that FIMS outperforms the GSP algorithm by a factor of 4 to 7 when the updated data is only a small portion of the whole database.
Keywords: data mining  sequential pattern  incremental updating
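
The abstract's idea of reusing earlier mining results can be illustrated with a minimal Python sketch. This is not the FIMS algorithm itself: the abstract does not describe how the projected database is built, so that step is omitted. The sketch only shows how support counts of previously frequent patterns can be refreshed by scanning the increment alone. It assumes the increment consists of whole new sequences appended to the database, and all names (`seq`, `is_subsequence`, `support`, `update_known_patterns`) are hypothetical.

```python
from math import ceil


def seq(*itemsets):
    """Build a hashable sequence: a tuple of frozensets (one per itemset)."""
    return tuple(frozenset(s) for s in itemsets)


def is_subsequence(pattern, sequence):
    """Return True if every itemset of `pattern` is contained, in order,
    in some itemset of `sequence` (greedy earliest match)."""
    pos = 0
    for itemset in sequence:
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)


def support(pattern, db):
    """Number of sequences in `db` that contain `pattern`."""
    return sum(1 for s in db if is_subsequence(pattern, s))


def update_known_patterns(old_counts, increment, total_size, min_support):
    """Refresh counts of previously frequent patterns by scanning only the
    increment (assumption: the increment holds whole new sequences).
    Patterns that were not frequent before are NOT discovered here."""
    threshold = ceil(min_support * total_size)
    updated = {}
    for pattern, old_count in old_counts.items():
        new_count = old_count + support(pattern, increment)
        if new_count >= threshold:
            updated[pattern] = new_count
    return updated


# Toy data: 4 original sequences, 4 new sequences, minimum support 50%.
original_db = [
    seq("a", "b", "c"),
    seq("a", "c"),
    seq("ab", "c"),
    seq("b"),
]
increment_db = [
    seq("a", "b"),
    seq("b", "c"),
    seq("b"),
    seq("b", "c"),
]

# Counts assumed to be saved from the earlier mining run on original_db.
old_counts = {
    seq("b"): support(seq("b"), original_db),
    seq("a", "c"): support(seq("a", "c"), original_db),
    seq("b", "c"): support(seq("b", "c"), original_db),
}

still_frequent = update_known_patterns(
    old_counts,
    increment_db,
    total_size=len(original_db) + len(increment_db),
    min_support=0.5,
)
for pattern, count in still_frequent.items():
    print([sorted(s) for s in pattern], count)
```

In this toy run the pattern (a)(c) falls below the new threshold and is dropped, while (b) and (b)(c) remain frequent. Patterns that become frequent only after the update still have to be found separately, which is the part the abstract assigns to mining the projected database and the final scan.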
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.