
滑动窗口下数据流完全加权最大频繁项集挖掘
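Cite this article: 王少鹏, 闻英友, 赵宏. 滑动窗口下数据流完全加权最大频繁项集挖掘[J]. 东北大学学报(自然科学版), 2016, 37(7): 931-936.
Authors: 王少鹏 (WANG Shao-peng), 闻英友 (WEN Ying-you), 赵宏 (ZHAO Hong)
Affiliations: (1. School of Information Science and Engineering, Northeastern University, Shenyang 110819, Liaoning; 2. Key Laboratory of Medical Image Computing, Ministry of Education, Northeastern University, Shenyang 110819, Liaoning)
Funding: National Natural Science Foundation of China (60903159, 61173153, 61402096); Fundamental Research Funds for the Central Universities (N110818001, N100218001, N130504007, N120104001); Shenyang Science and Technology Plan Project (1091176-1-00); National High Technology Research and Development Program of China (2015AA016005).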
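Abstract (translated from the Chinese): Existing work on weighted maximal frequent itemsets (WMFI) over data streams cannot effectively handle WMFI mining when the frequent threshold and the weighted frequent threshold differ. To address this, the concept of full weighted maximal frequent itemsets (FWMFI) is proposed. To reduce the redundant computation of the naive algorithm when mining FWMFI over a sliding window, the FWMFI-SW (FWMFI mining based on sliding window over data stream) algorithm is proposed. The algorithm uses an optimization strategy based on the frequency constraint to reduce the number of ineffective calls to the MaxW optimization strategy made by the naive algorithm, and it adopts the edit distance ratio as the reconstruction judge function of the WMFP-SW-tree, which effectively reduces the number of times the tree is reconstructed. Experimental results show that FWMFI-SW is effective and runs faster than the naive algorithm.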

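Keywords (translated from the Chinese): data stream; sliding window; edit distance ratio; weighted maximal frequent itemsets; reconstruction judge function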
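
The abstract does not give the formal definition of an FWMFI, so the following is only a minimal sketch of the general idea of applying two separate thresholds over a sliding window. Everything in it is an assumption made for illustration: the itemset weight is taken as the mean of its item weights, the weighted support as that weight times the support, and an itemset qualifies when its support reaches `min_sup` and its weighted support reaches `min_wsup`. The function name, the data, and the brute-force search are hypothetical and are not the paper's naive algorithm or FWMFI-SW.

```python
from collections import deque
from itertools import combinations


def maximal_itemsets(window, weights, min_sup, min_wsup):
    """Brute-force search under the assumed definitions described above."""
    n = len(window)
    items = sorted({item for t in window for item in t})
    qualifying = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            s = frozenset(combo)
            # Support: fraction of window transactions containing the itemset.
            support = sum(1 for t in window if s <= t) / n
            # Assumed itemset weight: mean weight of its items.
            weight = sum(weights[i] for i in s) / len(s)
            # Two independent thresholds (assumption, see lead-in).
            if support >= min_sup and weight * support >= min_wsup:
                qualifying.append(s)
    # Maximal: no qualifying proper superset exists.
    return [s for s in qualifying if not any(s < other for other in qualifying)]


# Sliding window holding the 4 most recent transactions (hypothetical data).
window = deque(maxlen=4)
weights = {"a": 0.9, "b": 0.6, "c": 0.3}
stream = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b"}]
for transaction in stream:
    window.append(frozenset(transaction))  # the oldest transaction drops out

result = maximal_itemsets(window, weights, min_sup=0.5, min_wsup=0.35)
```

With this toy data, the itemsets containing the low-weight item `c` meet the support threshold but not the weighted one, which is the kind of case where the two thresholds disagree. Re-enumerating every itemset on each slide is the sort of redundant work an incremental structure would aim to avoid.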

Mining Full Weighted Maximal Frequent Itemsets Based on Sliding Window over Data Stream
WANG Shao-peng,WEN Ying-you,ZHAO Hong.Mining Full Weighted Maximal Frequent Itemsets Based on Sliding Window over Data Stream[J].Journal of Northeastern University(Natural Science),2016,37(7):931-936.
Authors:WANG Shao-peng  WEN Ying-you  ZHAO Hong
Institution:1.School of Information Science & Engineering, Northeastern University, Shenyang 110819, China; 2.Key Laboratory of Medical Image Computing, Ministry of Education, Northeastern University, Shenyang 110819, China.
Abstract: Current research on weighted maximal frequent itemsets (WMFI) over data streams does not effectively handle WMFI mining when the frequent threshold differs from the weighted frequent threshold. To fill this gap, this work proposes the concept of full weighted maximal frequent itemsets (FWMFI). To reduce the redundant operations of the naive algorithm for FWMFI mining over a sliding window, the FWMFI-SW (FWMFI mining based on sliding window over data stream) algorithm is proposed. A mining optimization strategy based on the frequent property reduces unnecessary calls to the MaxW optimization strategy in the naive algorithm. In addition, the edit distance ratio is used as the reconstruction judge function to decide whether the updated WMFP-SW-tree should be reconstructed as the window slides. Extensive experiments show that the FWMFI-SW algorithm is effective and outperforms the naive algorithm in running time.
Keywords:data stream  sliding window  edit distance ratio  weighted maximal frequent itemsets  reconstruction judge function  
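
The abstract names the edit distance ratio as the criterion for rebuilding the WMFP-SW-tree but does not define it. As a rough sketch under stated assumptions, the ratio below is the Levenshtein distance between the item ordering before and after a window slide, divided by the length of the longer ordering, and the tree would be rebuilt only when the ratio exceeds a threshold. The function names, the normalisation, and the threshold value are hypothetical and may differ from the paper's formulation.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def should_rebuild(old_order, new_order, threshold):
    """Assumed rule: rebuild when the normalised edit distance exceeds threshold."""
    longest = max(len(old_order), len(new_order))
    if longest == 0:
        return False
    return edit_distance(old_order, new_order) / longest > threshold


# Hypothetical item orderings before and after the window slides.
old_order = ["a", "b", "c", "d"]
slightly_changed = ["a", "c", "b", "d"]
heavily_changed = ["d", "c", "b", "a"]

rebuild_small = should_rebuild(old_order, slightly_changed, threshold=0.6)
rebuild_large = should_rebuild(old_order, heavily_changed, threshold=0.6)
```

The intent of such a rule is that small changes in item order leave the existing tree in place, while a large reordering triggers one rebuild instead of a rebuild on every slide.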
This article is indexed in CNKI and other databases.