A Lazy Associative Classification Method Based on Information Entropy
Cite this article: HUANG Zai-xiang, HE Tian-zhong, QUAN Xiu-xiang, ZHENG Yi-feng. A lazy associative classification method based on information entropy [J]. Journal of Zhangzhou Teachers College, 2013, 0(3): 40-44
Authors: HUANG Zai-xiang  HE Tian-zhong  QUAN Xiu-xiang  ZHENG Yi-feng
Affiliation: Department of Computer Science and Engineering, Minnan Normal University, Zhangzhou, Fujian 363000, China
Funding: National Natural Science Foundation of China (61170129); Natural Science Foundation of Fujian Province (2013J01259)
Abstract: Lazy associative classification mines class association rules tailored to the features of each test instance and usually achieves high accuracy. However, some datasets contain low-quality features that degrade its accuracy, and long classification time is a further drawback. To address these problems, a lazy associative classification algorithm based on information entropy is proposed. The algorithm measures the quality of attribute values with information entropy and selects only the best k attribute values of each test instance, yielding a small training subset closely related to that instance, from which high-quality rules can be mined efficiently. Experiments show that, compared with lazy associative classification, the proposed method improves classification accuracy and greatly reduces running time.

Keywords: data mining  associative classification  lazy classification  information entropy

Lazy Associative Classification Based on Information Entropy
HUANG Zai-xiang,HE Tian-zhong,QUAN Xiu-xiang,ZHENG Yi-feng. Lazy Associative Classification Based on Information Entropy[J]. Journal of ZhangZhou Teachers College(Philosophy & Social Sciences), 2013, 0(3): 40-44
Authors:HUANG Zai-xiang  HE Tian-zhong  QUAN Xiu-xiang  ZHENG Yi-feng
Affiliation:(Department of Computer Science and Engineering, Minnan Normal University, Fujian, Zhangzhou 363000, China)
Abstract: Lazy associative classification (LAC) usually achieves high accuracy by focusing on the features of the given test instance. However, the accuracy of LAC is highly sensitive to low-quality features. Another disadvantage is that LAC typically takes considerable time to classify all test instances. To address these problems, Lazy Associative Classification based on Information Entropy (ELAC) is proposed in this paper. ELAC uses information entropy to measure attribute values and selects the best k attribute values in each test instance. As a result, a small training subset highly relevant to the test instance is produced, from which high-quality rules are efficiently mined. Experiments show that ELAC improves classification accuracy and significantly decreases test time.
Keywords: data mining  associative classification  lazy classification  information entropy
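The entropy-based selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`entropy`, `select_best_values`, `project_training_set`) and the representation of instances as attribute-value dictionaries are assumptions. Each attribute value of the test instance is scored by the Shannon entropy of the class labels among training instances sharing that value; the k purest (lowest-entropy) values are kept, and the training set is projected onto them before rule mining.

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy (bits) of a class-label distribution."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def select_best_values(train, labels, test_instance, k):
    """Score each attribute value of the test instance by the entropy of
    the class labels among training instances that share it, and keep the
    k values with the lowest entropy (purest class distribution)."""
    scores = []
    for attr, value in test_instance.items():
        matched = [y for x, y in zip(train, labels) if x.get(attr) == value]
        if matched:
            scores.append((entropy(matched), attr, value))
    scores.sort()
    return [(attr, value) for _, attr, value in scores[:k]]

def project_training_set(train, labels, selected):
    """Keep only training instances sharing at least one selected value,
    producing the small, test-instance-relevant subset to mine rules from."""
    return [
        (x, y) for x, y in zip(train, labels)
        if any(x.get(a) == v for a, v in selected)
    ]
```

In this sketch a low entropy score means the attribute value splits off a nearly pure class region, so rules mined from the projected subset are both fewer (faster mining) and more discriminative (higher accuracy), which mirrors the trade-off the abstract reports.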
This article is indexed in VIP (Weipu) and other databases.