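An Importance of Elements Estimation Algorithm Based on Set-Cover in Data Mining
Cite this article: Xue Ruo-wen, Chen Gang. An Importance of Elements Estimation Algorithm Based on Set-Cover in Data Mining [J]. Science Technology and Engineering, 2013, 13(33).
Authors: Xue Ruo-wen, Chen Gang
Affiliations: Academic Affairs Office, Suzhou Vocational Health College; School of Information Science and Engineering, Central South University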
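Abstract: Estimating the importance of the elements of a given data set is an important application in data mining. Existing techniques identify important elements either by ranking or by selection. Their main drawback is that they do not account for the fact that highly ranked objects may be very similar or even identical, so they ignore the redundancy among highly ranked objects. As a result, such methods perform poorly in settings where diversity matters. This paper combines ranking and selection and proposes an element importance estimation algorithm based on set cover. Rather than examining a single set-cover solution, the algorithm counts the number of high-quality set covers in which an element participates and assigns each element an importance score accordingly. Experiments on real data and a user study show that the algorithm is efficient and that its importance estimates are useful and consistent with human perception.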

Keywords: data mining; element importance; ranking; selection; set cover; score
Received: 2013-07-08
Revised: 2013-07-24

An Importance of Elements Estimation Algorithm Based on Set-Cover in Data Mining
Xue Ruo-wen and Chen Gang. An Importance of Elements Estimation Algorithm Based on Set-Cover in Data Mining [J]. Science Technology and Engineering, 2013, 13(33).
Authors: Xue Ruo-wen and Chen Gang
Institution: School of Information Science and Engineering, Central South University, Changsha
Abstract: Estimating the importance of the elements of a given data set is an important application in data mining. Existing work identifies important entities either by ranking or by selection. The major shortcoming of such approaches is that they ignore the redundancy among high-ranked entities, which may in fact be very similar or even identical. Therefore, in scenarios where diversity is desirable, such methods perform poorly. In this paper, we combine ranking and selection and propose an element importance estimation algorithm based on set cover. Instead of looking at a single set-cover solution, our algorithm computes the importance of an entity by counting the number of good set covers in which the entity participates. In a user study and an experimental evaluation on real data, we demonstrate that our framework is efficient and provides useful and intuitive results.
Keywords: data mining; element importance; ranking; selection; set cover; score
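
The counting idea in the abstract can be illustrated with a small Python sketch. This is not the authors' algorithm: the abstract does not define what makes a set cover "good" or how the covers are counted efficiently, so the sketch assumes that a good cover is a minimal one (no entity can be dropped without losing coverage) and enumerates covers by brute force, which is only feasible for toy inputs. The function names `is_cover`, `minimal_covers`, and `importance_scores` are hypothetical.

```python
from itertools import combinations


def is_cover(chosen_sets, universe):
    """Return True if the union of chosen_sets contains every item of universe."""
    covered = set()
    for s in chosen_sets:
        covered |= s
    return covered >= universe


def minimal_covers(entities, universe):
    """Enumerate all minimal set covers by brute force.

    entities maps an entity name to the set of items it covers.
    Assumption: "good" cover means minimal cover. Exponential time,
    so this is suitable for toy inputs only.
    """
    names = sorted(entities)
    found = []
    # Visit candidates by increasing size, so any non-minimal cover
    # is a superset of a minimal cover that was found earlier.
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            if any(set(prev) <= set(combo) for prev in found):
                continue  # contains a smaller cover, hence not minimal
            if is_cover([entities[n] for n in combo], universe):
                found.append(combo)
    return found


def importance_scores(entities, universe):
    """Score each entity by the number of minimal covers it takes part in."""
    scores = {name: 0 for name in entities}
    for cover in minimal_covers(entities, universe):
        for name in cover:
            scores[name] += 1
    return scores


if __name__ == "__main__":
    universe = {1, 2, 3, 4}
    # "a" and "b" are identical (redundant); "c" is the only way to cover 3 and 4.
    entities = {"a": {1, 2}, "b": {1, 2}, "c": {3, 4}}
    print(importance_scores(entities, universe))  # {'a': 1, 'b': 1, 'c': 2}
```

In this toy input the two identical entities `a` and `b` each appear in one minimal cover, while the irreplaceable entity `c` appears in both, so counting covers separates redundant entities from essential ones in a way that an independent per-entity ranking would not.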