
Unbalanced sample set classification based on support vector machine
Citation: Ding Fuli, Sun Limin. Unbalanced sample set classification based on support vector machine[J]. Science Technology and Engineering, 2014, 14(3)
Authors: Ding Fuli, Sun Limin
Affiliation: School of Computer Science, Yantai University, Yantai 264005 (both authors)
Funding: Natural Science Foundation of Shandong Province (2009ZRB019CE)
Abstract: Classification is one of the major research directions in machine learning. The support vector machine (SVM) is a learning machine based on structural risk minimization and performs very well on classification problems. However, when an SVM-based classifier is applied to an unbalanced sample set, its classification accuracy on the minority class is low. Many studies attribute this mainly to the imbalance in the number of samples per class and do not fully consider how the sample points are distributed in the feature space. This paper analyzes the causes of the problem and concludes that the imbalance is determined mainly by the distribution of each class's samples in the feature space and depends much less on the imbalance in sample counts. Experimental results support this conclusion.

Keywords: support vector machine; unbalanced sample set; feature space; sample distribution
Received: 2013-08-09
Revised: 2013-09-13
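
The abstract's central claim can be illustrated with a small toy experiment. The sketch below is not the authors' experiment: it assumes one-dimensional data, a linear soft-margin SVM trained by plain full-batch subgradient descent, and hypothetical helper names (`linspace`, `train_linear_svm`, `minority_recall`) and hyperparameters chosen only for illustration. Both scenarios use the same 100:5 class ratio, and recall is measured on the training points themselves.

```python
def linspace(lo, hi, n):
    """Return n evenly spaced floats from lo to hi inclusive."""
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]


def train_linear_svm(xs, ys, lam=0.01, lr=0.1, epochs=2000):
    """Full-batch subgradient descent on the 1-D soft-margin objective
    lam/2 * w^2 + mean(max(0, 1 - y * (w*x + b))).
    lam, lr and epochs are illustrative assumptions, not tuned values."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w, grad_b = lam * w, 0.0
        for x, y in zip(xs, ys):
            if y * (w * x + b) < 1.0:  # point lies inside or beyond the margin
                grad_w -= y * x / n
                grad_b -= y / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def minority_recall(majority_x, minority_x):
    """Train on both classes; return recall on the minority (+1) class."""
    xs = majority_x + minority_x
    ys = [-1] * len(majority_x) + [1] * len(minority_x)
    w, b = train_linear_svm(xs, ys)
    hits = sum(1 for x in minority_x if w * x + b > 0)
    return hits / len(minority_x)


# Same 100:5 class ratio in both scenarios; only the placement differs.
recall_separated = minority_recall(linspace(-3, -1, 100), linspace(1, 3, 5))
recall_overlapping = minority_recall(linspace(-1, 1, 100), linspace(-0.5, 0.5, 5))

print("minority recall, separated classes:  ", recall_separated)
print("minority recall, overlapping classes:", recall_overlapping)
```

On this toy data the minority class is fully recovered when the two classes occupy separate regions and is missed entirely when it sits inside the majority region, even though the count ratio is identical. A kernel SVM or a different data layout could behave differently, so this is only a consistency check of the stated conclusion, not evidence for it.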
This article is indexed in CNKI and other databases.