首页 | 本学科首页   官方微博 | 高级检索  
     检索      

一种挖掘最大频繁子图的新算法
引用本文:王映龙,杨炳儒,宋泽嶂,陈卓,李琳娜.一种挖掘最大频繁子图的新算法[J].系统仿真学报,2008,20(18).
作者姓名:王映龙  杨炳儒  宋泽嶂  陈卓  李琳娜
作者单位:北京科技大学信息工程学院
摘    要:如何从大量的图中挖掘出令人感兴趣的子图模式已经成为数据挖掘领域研究的热点之一.由于其内在的计算复杂性,挖掘全部频繁子图非常困难,且得到的频繁子图过多,影响着结果的理解和应用.解决方案之一是挖掘最大频繁子图.在经典的Apriori算法的基础上,提出了一种挖掘最大频繁予图的新算法Apriori-MaxGraph.首先给出了一种新的、用于计算图的邻接矩阵规范编码的结点排序策略,大大降低了求图规范编码的复杂度,并可以加速子图规范编码序列匹配的速度.其次,针对最大频繁子图,对候选子图的生成进行了规范.最后,采用双向搜索与剪枝策略,大大减小了搜索空间,提高了算法的效率,实验结果表明,Apriori-MaxGraph算法具有较高的挖掘效率.

关 键 词:数据挖掘  最大频繁子图  邻接矩阵  规范编码

New Algorithm for Mining Maximal Frequent Subgraphs
WANG Ying-long,YANG Bing-ru,SONG Ze-feng,CHEN Zuo,LI Lin-na.New Algorithm for Mining Maximal Frequent Subgraphs[J].Journal of System Simulation,2008,20(18).
Authors:WANG Ying-long  YANG Bing-ru  SONG Ze-feng  CHEN Zuo  LI Lin-na
Abstract:Frequent subgraph mining is an active research topic in the data mining community. Because of the inherent computational complexity, mining the complete frequent subgrahps remains to be a challenging task. Furthermore, the number of discovered patterns is often too huge to understand and apply. Mining Maximal Frequent Subgraphs is an alternative to address the problem. Based on the classical Apriori algorithm, Apriori-MaxGraph, which was a maximal subgraph mining algorithm, was proposed. Firstly, to lower the complexity of computing canonical code of adjacency matrix of graph, a new vertex sorting strategy was introduced. Meanwhile, the sorting strategy could speed the matching process of sequences of canonical codes. Secondly, aiming at maximal frequent subgraphs, the process of candidate generation was standardized. Finally, bi-directional searching and pruning were exploited. Thus, the search space was reduced greatly, and the efficiency was improved. Experimental results show the proposed algorithm is efficient.
Keywords:data mining  maximal frequent subgraph  adjacency matrix  canonical code
本文献已被 CNKI 维普 万方数据 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号