
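A Partitioning Method for Continuous Attribute Values Based on a Statistical Criterion
Cite this article: Gao Hongtao, Lu Wei, Yang Yuwang. A Partitioning Method for Continuous Attribute Values Based on a Statistical Criterion [J]. Science Technology and Engineering (科学技术与工程), 2018, 18(16).
Authors: Gao Hongtao (高洪涛), Lu Wei (陆伟), Yang Yuwang (杨余旺)
Affiliations: Department of Cyber Crime Investigation, Criminal Investigation Police University of China; School of Computer Science and Engineering, Nanjing University of Science and Technology
Funding: National Key Technology R&D Program (2007BAK34B03); National Natural Science Foundation of China (No. 61640020)
Abstract (translated from the Chinese): Many decision-tree classification algorithms, such as ID3, C4.5, and C5.0, rely on discrete attribute values and expect the value range of each attribute to be partitioned into a finite number of intervals. Using a statistical criterion, this paper proposes a new partitioning method for continuous attribute values; the method uses the criterion to identify precisely which intervals should be merged. Building on this, a heuristic partitioning algorithm is proposed to obtain a good partition and thereby improve the classification learning performance of decision-tree algorithms. Simulation experiments were carried out on real UCI data sets. The results show that the method achieves relatively high classification learning accuracy and compares well with common partitioning algorithms.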

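Keywords (translated from the Chinese): continuous attribute values; learning accuracy; statistical criterion; classification algorithms
Received: 2017-12-09
Revised: 2018-03-06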

A New Partition Approach for Continuous Attributes Based on a Statistical Criterion
Institution: Department of Cyber Crime Investigation, Criminal Investigation Police University of China; School of Computer Science and Engineering, Nanjing University of Science and Technology
Abstract: Many classification algorithms, such as the ID3, C4.5, and C5.0 decision tree algorithms, rely on discrete attributes and need continuous attributes to be quantized into a finite number of intervals. This paper presents a new data partition method for continuous attributes. The approach uses a statistical criterion to identify precisely which discrete intervals should be merged. To improve the classification performance of decision tree algorithms, a heuristic algorithm for obtaining a good partition is also discussed. A series of simulations was run on UCI data sets. The experimental results and performance analysis show that the approach is a good partition model and that the C4.5 decision tree classification algorithm benefits considerably from it.
Keywords: continuous attributes; learning accuracy; statistical criterion; classification algorithms
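
The abstract does not say which statistical criterion the authors use or how their heuristic works, so the following is only an illustrative sketch of the general idea of merging adjacent intervals under a statistical test. It swaps in the well-known ChiMerge scheme (Pearson chi-square on the class counts of neighbouring intervals); it is not the paper's method. The function names `chi2_adjacent` and `chimerge`, and the default threshold of 2.706 (the 0.90 critical value for one degree of freedom, i.e. two classes), are assumptions chosen for the example.

```python
from collections import Counter


def chi2_adjacent(a, b, classes):
    """Pearson chi-square statistic for the class counts of two adjacent intervals."""
    total = sum(a.values()) + sum(b.values())
    chi2 = 0.0
    for row in (a, b):
        row_total = sum(row.values())
        for c in classes:
            col_total = a[c] + b[c]
            expected = row_total * col_total / total
            if expected > 0:  # skip classes absent from both intervals
                chi2 += (row[c] - expected) ** 2 / expected
    return chi2


def chimerge(values, labels, threshold=2.706, min_intervals=2):
    """ChiMerge-style bottom-up discretization (illustrative, not the paper's method).

    Returns the sorted lower bounds of the final intervals.
    """
    classes = sorted(set(labels))
    # Start with one interval per distinct attribute value.
    counts = {}
    for v, y in zip(values, labels):
        counts.setdefault(v, Counter())[y] += 1
    intervals = [[v, counts[v]] for v in sorted(counts)]

    while len(intervals) > min_intervals:
        scores = [
            chi2_adjacent(intervals[i][1], intervals[i + 1][1], classes)
            for i in range(len(intervals) - 1)
        ]
        i = min(range(len(scores)), key=scores.__getitem__)
        if scores[i] > threshold:
            break  # every adjacent pair differs significantly; stop merging
        # Merge the most similar adjacent pair into one interval.
        intervals[i][1] = intervals[i][1] + intervals[i + 1][1]
        del intervals[i + 1]
    return [lower for lower, _ in intervals]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 10, 11, 12, 13]
    y = list("aaaabbbb")
    print(chimerge(x, y))  # [1, 10]
```

On this toy input the values 1 to 4 all carry class `a` and 10 to 13 all carry class `b`, so the sketch collapses eight single-value intervals into two, with lower bounds 1 and 10. A discretized attribute of this kind is what a decision tree learner such as C4.5 would then consume in place of the raw continuous values.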
This article is indexed in CNKI and other databases.