
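Associative classification method for imbalanced data based on sampling and rules
Cite this article: YANG Guangfei, CUI Xuejiao, ZHANG Xiang. Associative classification method for imbalanced data based on sampling and rules[J]. Systems Engineering - Theory & Practice, 2017, 37(4): 1035-1045.
Authors: YANG Guangfei  CUI Xuejiao  ZHANG Xiang
Institution: Institute of Systems Engineering, Dalian University of Technology, Dalian 116024
Abstract: Imbalanced data poses a major challenge to traditional associative classification algorithms. To improve the classification accuracy of associative classification on imbalanced data sets, this paper works at both the data level and the rule level, proposing key value sampling (KVS) and rule validation (RV). KVS balances the class distribution by adding data strongly correlated with the minority class and reducing data weakly correlated with the majority class; this avoids losing a large amount of useful information and reinforces the information strongly correlated with the minority class. RV validates the rules of the initially generated classifier and adjusts rules with poor classification performance, thereby ensuring the quality of the rules in the classifier. Experiments show that the proposed methods effectively improve the accuracy of associative classification on imbalanced data.

Keywords: associative classification  imbalanced data  key value sampling  rule validation
Received: 2015-10-08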

Sample and rule centric approach for associative classification on imbalanced data
YANG Guangfei, CUI Xuejiao, ZHANG Xiang. Sample and rule centric approach for associative classification on imbalanced data[J]. Systems Engineering - Theory & Practice, 2017, 37(4): 1035-1045.
Authors:YANG Guangfei  CUI Xuejiao  ZHANG Xiang
Institution:Institute of Systems Engineering, Dalian University of Technology, Dalian 116024, China
Abstract: The emergence of imbalanced data poses a great challenge to associative classification (AC) methods. To improve the performance of AC on imbalanced data, this paper presents a key value sampling (KVS) method at the data level and a rule validation (RV) method at the rule level. KVS resamples the original imbalanced data and achieves class balance by removing instances weakly correlated with the majority class and adding instances strongly correlated with the minority class, which prevents the loss of much useful information and highlights the information related to the minority class. RV validates the initially generated classifier and adjusts the rules that perform poorly, which improves the performance of the whole classifier. Experiments show that the proposed methods improve the performance of AC on imbalanced data.
Keywords:associative classification  imbalanced data  sample centric processing  rule centric processing
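
The abstract describes KVS only in outline, so the following is a rough illustrative sketch of the general idea rather than the authors' algorithm. It assumes a two-class data set with categorical attributes, and it assumes that an instance's correlation with a class can be approximated by the mean confidence P(class | attribute = value) over its attribute values. That scoring rule, and the names `class_affinity` and `rebalance`, are hypothetical and are not taken from the paper.

```python
from collections import Counter


def class_affinity(rows, labels, target):
    """Score each row by the mean of P(target | attribute=value) over its items.

    ASSUMPTION: this confidence-based score is a stand-in for the paper's
    (unspecified here) measure of correlation between an instance and a class.
    """
    item_total = Counter()
    item_target = Counter()
    for row, y in zip(rows, labels):
        for item in enumerate(row):  # item = (attribute index, value)
            item_total[item] += 1
            if y == target:
                item_target[item] += 1
    return [
        sum(item_target[item] / item_total[item] for item in enumerate(row)) / len(row)
        for row in rows
    ]


def rebalance(rows, labels, minority, majority):
    """Return a class-balanced copy of a two-class data set.

    Majority rows weakly tied to the majority class are dropped, and minority
    rows strongly tied to the minority class are duplicated. Assumes at least
    one minority row and no more minority rows than majority rows.
    """
    min_score = class_affinity(rows, labels, minority)
    maj_score = class_affinity(rows, labels, majority)
    # Strongest rows first within each class.
    min_idx = sorted((i for i, y in enumerate(labels) if y == minority),
                     key=lambda i: -min_score[i])
    maj_idx = sorted((i for i, y in enumerate(labels) if y == majority),
                     key=lambda i: -maj_score[i])
    per_class = (len(min_idx) + len(maj_idx)) // 2
    kept_majority = maj_idx[:per_class]  # drop the weakest majority rows
    # Duplicate minority rows, cycling from the strongest downwards.
    extra_minority = [min_idx[k % len(min_idx)]
                      for k in range(per_class - len(min_idx))]
    chosen = kept_majority + min_idx + extra_minority
    return [rows[i] for i in chosen], [labels[i] for i in chosen]


rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"), ("rainy", "mild"),
        ("rainy", "cool"), ("cloudy", "cool"), ("cloudy", "mild"), ("sunny", "cool")]
labels = ["pos", "pos", "neg", "neg", "neg", "neg", "neg", "neg"]
new_rows, new_labels = rebalance(rows, labels, minority="pos", majority="neg")
print(Counter(new_labels))
```

In this toy data the two majority rows that share attribute values with the minority class are the ones removed, which mirrors the abstract's stated goal of discarding data weakly correlated with the majority class while keeping the rest. A real implementation would need the correlation measure and stopping criterion defined in the paper itself.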
This article is indexed in CNKI and other databases.