
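A Kernel-Learning-Based Classification Algorithm for Imbalanced Data
Cite this article: ZHONG Ying, ZHU Shun-zhi, ZENG Zhi-qiang, HONG Wen-xing. A Kernel-Learning-Based Classification Algorithm for Imbalanced Data[J]. Journal of Xiamen University (Natural Science), 2012, 51(2): 189-194
Authors: ZHONG Ying  ZHU Shun-zhi  ZENG Zhi-qiang  HONG Wen-xing
Affiliations: 1. Department of Computer Science and Technology, Xiamen University of Technology, Xiamen, Fujian 361024
2. School of Information Science and Technology, Xiamen University, Xiamen, Fujian 361005
Funding: National Natural Science Foundation of China (61070151); Class A Science and Technology Project of the Fujian Provincial Department of Education (JA11241)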
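Abstract: This paper proposes a kernel-learning-based sampling algorithm for classifying imbalanced data sets with a support vector machine (SVM). The core idea is to first oversample the minority class in kernel space, then find the pre-images of the synthesized samples in input space through the distance relationship between input space and kernel space, and finally train an SVM on them. This effectively overcomes the data inconsistency caused by processing training samples in different spaces. In addition, while increasing the number of minority-class samples and reducing the degree of imbalance, the algorithm effectively expands the convex hull formed by the minority-class samples. It can therefore correct the shift of the optimal separating hyperplane more effectively, so that the resulting classifier generalizes better. Experimental results demonstrate the efficiency of the algorithm.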

Keywords: imbalanced data set  kernel learning  convex hull  pre-image
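The three steps named in the abstract above (oversampling in kernel space, recovering a pre-image from distance relations, then SVM training) can be illustrated with a minimal sketch. The abstract does not specify the kernel, the interpolation rule, or the pre-image solver, so everything below is an assumption made for illustration: an RBF kernel with a hypothetical width `gamma`, a SMOTE-style interpolation weight `delta` between two minority samples, and a least-squares multilateration step standing in for the paper's pre-image procedure. The function names `rbf`, `implied_sq_dists`, `solve_linear`, and `preimage` are hypothetical.

```python
import math


def rbf(x, y, gamma):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2) (assumed kernel)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def implied_sq_dists(xi, xj, delta, anchors, gamma):
    """Squared input-space distances implied for the kernel-space point
    phi_s = (1 - delta) * phi(xi) + delta * phi(xj), one per anchor."""
    # ||phi_s||^2, using k(x, x) = 1 for the RBF kernel
    norm_sq = ((1 - delta) ** 2 + delta ** 2
               + 2 * delta * (1 - delta) * rbf(xi, xj, gamma))
    dists = []
    for a in anchors:
        inner = (1 - delta) * rbf(xi, a, gamma) + delta * rbf(xj, a, gamma)
        feat_sq = norm_sq + 1.0 - 2.0 * inner        # ||phi_s - phi(a)||^2
        k_equiv = max(1.0 - feat_sq / 2.0, 1e-12)    # invert d_F^2 = 2 - 2k
        dists.append(-math.log(k_equiv) / gamma)     # invert k = exp(-gamma d^2)
    return dists


def solve_linear(m, v):
    """Solve m z = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[r]] for r, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z


def preimage(anchors, sq_dists):
    """Least-squares point whose squared distances to the anchors match
    sq_dists. Needs at least dim + 1 anchors in general position."""
    a0, d0 = anchors[0], sq_dists[0]
    n0 = sum(c * c for c in a0)
    rows, rhs = [], []
    for a, d in zip(anchors[1:], sq_dists[1:]):
        # Subtracting the first distance equation cancels ||x||^2:
        # 2 x.(a - a0) = ||a||^2 - ||a0||^2 - (d - d0)
        rows.append([2.0 * (ac - bc) for ac, bc in zip(a, a0)])
        rhs.append(sum(c * c for c in a) - n0 - (d - d0))
    dim = len(a0)
    # Normal equations (A^T A) x = A^T b
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(dim)]
           for i in range(dim)]
    atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(dim)]
    return solve_linear(ata, atb)


# Toy usage: four minority samples act as anchors; synthesize one new sample
# halfway (in kernel space) between two further minority samples.
minority_anchors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
x_i, x_j = (0.2, 0.3), (0.8, 0.6)
d2 = implied_sq_dists(x_i, x_j, 0.5, minority_anchors, gamma=0.5)
synthetic = preimage(minority_anchors, d2)
```

In this sketch the synthetic pre-images would be appended to the original training set and passed to any standard SVM trainer; that final step is omitted because it does not depend on how the samples were generated. With `delta` set to 0 or 1 the implied distances are exact, so the solver returns `x_i` or `x_j` respectively, which is a convenient sanity check.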

A Classification Method for Imbalance Dataset Based on Kernel Learning
ZHONG Ying, ZHU Shun-zhi, ZENG Zhi-qiang, HONG Wen-xing. A Classification Method for Imbalance Dataset Based on Kernel Learning[J]. Journal of Xiamen University (Natural Science), 2012, 51(2): 189-194
Authors:ZHONG Ying    ZHU Shun-zhi    ZENG Zhi-qiang    HONG Wen-xing
Affiliation: 1. Department of Computer Science and Technology, Xiamen University of Technology, Xiamen 361024, China; 2. School of Information Science and Technology, Xiamen University, Xiamen 361005, China
Abstract: This paper presents a sampling approach based on kernel learning for classifying imbalanced data sets with an SVM. The method first preprocesses the data by oversampling the minority class in feature space, and then finds the pre-images of the synthetic samples using a distance relation between feature space and input space. Finally, these pre-images are appended to the original data set to train an SVM. Experiments on real data sets indicate that, compared with the SMOTE approach, the samples constructed by the proposed method are of higher quality. As a result, the effectiveness of SVM classification on imbalanced data sets is improved. In addition, the paper gives a theoretical analysis of the approximation between the quadratic programs corresponding to SMOTE combined with SVM and to Biased SVM, which contributes to research on classifying imbalanced data sets with this type of method.
Keywords:imbalance data set  kernel learning  convex hull  pre-image
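The abstract above refers to the quadratic programs behind oversampling with an SVM and behind Biased SVM without stating them. As a point of reference only, and not as the paper's own formulation, a commonly used cost-biased soft-margin SVM assigns separate penalties to the two classes. The symbols $C^{+}$, $C^{-}$, $\xi_i$, $w$, $b$, and $\phi$ below are assumed notation:

```latex
\min_{w,\,b,\,\xi}\quad
  \frac{1}{2}\lVert w\rVert^{2}
  + C^{+}\sum_{i:\,y_i=+1}\xi_i
  + C^{-}\sum_{i:\,y_i=-1}\xi_i
\qquad\text{s.t.}\quad
  y_i\bigl(w^{\top}\phi(x_i)+b\bigr)\ \ge\ 1-\xi_i,\quad \xi_i\ \ge\ 0 .
```

Under this assumed form, replicating every minority sample $m$ times in a standard SVM with a single penalty $C$ gives the same problem as setting $C^{+}=mC$ and $C^{-}=C$, because identical copies can share one slack variable. SMOTE-style methods add nearby synthetic points rather than exact copies, which is one way to read the word "approximation" in the abstract.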
This article is indexed in CNKI, Wanfang Data, and other databases.