
Algorithm of frequent item sets clustering based on link (基于连接的频繁集聚类算法)
Cite this article: WANG Bo, QIAN Xiao-Tang, ZHANG Bin, ZHANG Ming-wei. Algorithm of frequent item sets clustering based on link[J]. Journal of Liaoning Technical University (Natural Science Edition), 2005, 24(Z2): 150-152
Authors: WANG Bo (王波)  QIAN Xiao-Tang (钱晓棠)  ZHANG Bin (张斌)  ZHANG Ming-wei (张明卫)
Affiliation: School of Information Science and Engineering, Northeastern University, Shenyang 110004
Funding: Key project of the national "Tenth Five-Year Plan" science and technology program, Ministry of Science and Technology of China (2004BA72lA05)
Abstract:
An efficient clustering algorithm is proposed for the multi-attribute clustering of frequent item sets in large transaction databases. Earlier clustering algorithms rely on distance-based computation; because such computation is constrained by the attribute data, these algorithms are of limited use in frequent item set mining. Building on attribute clustering, the proposed algorithm clusters frequent item sets by their links. It first finds the neighbors of each data point and computes similarities to construct a neighbor matrix. It then counts the links to obtain a neighbor number matrix. Finally, it determines the number of clusters by means of a criterion function and a threshold. Experiments show that the algorithm not only performs multi-attribute clustering of frequent item sets effectively, but can also reveal correlations among frequent item sets at a given level.

Keywords: clustering  frequent item set  similarity matrix  neighbor  link
Article ID: 1008-0562(2005)Suppl.-0150-03
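
The three steps named in the abstract (neighbor matrix, link counts, threshold-based stopping) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the similarity measure or the criterion function, so the sketch assumes Jaccard similarity, a hypothetical neighbor threshold `theta`, and a simple stand-in stopping rule (`min_links`) that merges the two clusters sharing the most links until no pair reaches the threshold. All function and parameter names are illustrative.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two item sets (assumed measure, not from the paper)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def neighbor_matrix(itemsets, theta):
    """adj[i][j] = 1 when item sets i and j are neighbors (similarity >= theta).

    A point is not counted as its own neighbor in this sketch.
    """
    n = len(itemsets)
    adj = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if jaccard(itemsets[i], itemsets[j]) >= theta:
            adj[i][j] = adj[j][i] = 1
    return adj


def link_matrix(adj):
    """links[i][j] = number of neighbors that i and j have in common."""
    n = len(adj)
    links = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        common = sum(adj[i][k] * adj[j][k] for k in range(n))
        links[i][j] = links[j][i] = common
    return links


def cluster_by_links(itemsets, theta=0.5, min_links=1):
    """Greedily merge the pair of clusters with the most cross-links.

    Stops when the best pair has fewer than min_links cross-links; this is a
    stand-in for the paper's criterion function, which the abstract omits.
    """
    links = link_matrix(neighbor_matrix(itemsets, theta))
    clusters = [[i] for i in range(len(itemsets))]
    while len(clusters) > 1:
        best, best_pair = 0, None
        for a, b in combinations(range(len(clusters)), 2):
            cross = sum(links[i][j] for i in clusters[a] for j in clusters[b])
            if cross > best:
                best, best_pair = cross, (a, b)
        if best_pair is None or best < min_links:
            break
        a, b = best_pair
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


# Six made-up frequent item sets forming two obvious groups.
example = [
    {"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "d"},
    {"x", "y", "z"}, {"x", "y", "w"}, {"x", "z", "w"},
]
print(cluster_by_links(example, theta=0.5, min_links=1))
# prints [[0, 1, 2], [3, 4, 5]]
```

Counting common neighbors rather than comparing two item sets directly is what lets the grouping depend on the neighborhood as a whole, which is the advantage over distance-based methods that the abstract points to.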
This article is indexed in CNKI, Wanfang Data, and other databases.