
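Classification Algorithm for Imbalanced Data Based on Multiple Confidences
Cite this article: He Tianzhong, Huang Zaixiang. Classification algorithm for imbalanced data based on multiple confidences [J]. 漳州师院学报 (Journal of Zhangzhou Teachers College), 2014(4): 26-30.
Authors: He Tianzhong, Huang Zaixiang
Affiliation: School of Computer Science, Minnan Normal University, Zhangzhou, Fujian 363000
Funding: National Natural Science Foundation of China (61170129); Natural Science Foundation of Fujian Province (2013J01259)
Abstract: Traditional classification algorithms usually set one uniform minimum confidence for extracting rules. When the training data set is imbalanced, classification algorithms that use a uniform confidence achieve low accuracy on the minority class. This paper proposes CBMI, a multi-confidence classification algorithm for imbalanced data that is based on the class distribution of the training set. In CBMI, a different minimum confidence is set for rule extraction according to the distribution of each class in the training data, and the confidence threshold for the minority class is lower than that for the majority class. In addition, CBMI combines three measures to select "good" attribute values. Experimental results show that CBMI improves classification accuracy on minority-class data.

Keywords: data mining; classification; imbalanced data; minority class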

Classification Based on Multi-confidence for Imbalanced Data Set
HE Tian-zhong, HUANG Zai-xiang. Classification Based on Multi-confidence for Imbalanced Data Set [J]. Journal of ZhangZhou Teachers College (Philosophy & Social Sciences), 2014(4): 26-30.
Authors: HE Tian-zhong, HUANG Zai-xiang
Institution: School of Computer Science, Minnan Normal University, Zhangzhou, Fujian 363000, China
Abstract: Traditional classification algorithms usually apply a single, uniform minimum confidence when extracting rules. If the training data set is imbalanced, however, algorithms based on a uniform minimum confidence suffer from low accuracy on the minority class. This paper proposes CBMI, a classification approach for imbalanced data sets that sets multiple minimum confidences based on the class distribution of the training data. For each class, CBMI extracts rules using a minimum confidence tied to that class's share of the training data, so the minimum confidence of the minority class is lower than that of the majority class. In addition, CBMI uses a measure that integrates three measurements to select the best attribute value. Experimental results show that CBMI improves the accuracy on the minority class.
Keywords: classification; imbalanced data set; minority class
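
The idea of class-dependent minimum confidences can be illustrated with a minimal Python sketch. This is not the CBMI algorithm as published: the abstract does not give the formula that maps class distribution to a threshold, so the linear interpolation below, the parameters `conf_low` and `conf_high`, and all function names are assumptions made for illustration. The sketch also considers only single-attribute rules and does not reproduce the three-measure attribute-value selection.

```python
from collections import Counter


def class_min_confidences(labels, conf_low=0.5, conf_high=0.9):
    """Give each class a minimum confidence that grows with its share of the data.

    Assumption: thresholds are linearly interpolated between conf_low (rarest
    class) and conf_high (most frequent class). The paper's mapping may differ.
    """
    counts = Counter(labels)
    total = len(labels)
    shares = {c: n / total for c, n in counts.items()}
    lo, hi = min(shares.values()), max(shares.values())
    if hi == lo:  # perfectly balanced data: fall back to one uniform threshold
        return {c: conf_high for c in shares}
    return {
        c: conf_low + (conf_high - conf_low) * (s - lo) / (hi - lo)
        for c, s in shares.items()
    }


def rule_confidence(rows, labels, attr_index, value, target):
    """Confidence of the rule (attribute == value) -> target."""
    covered = [y for x, y in zip(rows, labels) if x[attr_index] == value]
    if not covered:
        return 0.0
    return sum(1 for y in covered if y == target) / len(covered)


def extract_rules(rows, labels, thresholds):
    """Keep every single-attribute rule whose confidence meets its class threshold."""
    rules = []
    for a in range(len(rows[0])):
        for v in sorted({x[a] for x in rows}):
            for c, t in thresholds.items():
                conf = rule_confidence(rows, labels, a, v, c)
                if conf >= t:
                    rules.append((a, v, c, conf))
    return rules


if __name__ == "__main__":
    # Toy data: 16 majority rows, 4 minority rows, one attribute.
    rows = [("a",)] * 14 + [("b",)] * 6
    labels = ["maj"] * 14 + ["maj"] * 2 + ["min"] * 4
    per_class = class_min_confidences(labels)
    uniform = {c: 0.9 for c in per_class}
    print("per-class thresholds:", per_class)
    print("rules, per-class:", extract_rules(rows, labels, per_class))
    print("rules, uniform:  ", extract_rules(rows, labels, uniform))
```

In this toy data the rule `b -> min` has confidence 4/6, so it is discarded under a uniform threshold of 0.9 but kept once the minority class is given a lower threshold, which is the effect the abstract describes.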
This article is indexed in databases including VIP (维普).