
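Research on Improvement of Random Forests Algorithm Based on Classification Accuracy and Correlation
Cite this article: Wang Ri Sheng, Xie Hong Wei, An Jian Cheng. Research on Improvement of Random Forests Algorithm Based on Classification Accuracy and Correlation[J]. Science Technology and Engineering, 2017, 17(20).
Authors: Wang Ri Sheng  Xie Hong Wei  An Jian Cheng
Institution: Taiyuan University of Technology, Taiyuan University of Technology, Taiyuan University of Technology
Funding: National High Technology Research and Development Program of China (863 Program); Shanxi Province International Science and Technology Cooperation Project
Abstract: To improve the classification accuracy of the traditional random forest algorithm, this paper first sorts the decision trees of a traditional random forest model in descending order of AUC, a classification performance metric, and selects the trees with high AUC values. It then computes the similarity between the selected trees to build a similarity matrix and clusters the trees according to that matrix. Finally, the tree with the highest AUC in each cluster is selected, and these trees form a new random forest model, which is how the method raises classification accuracy over the traditional algorithm. Experiments on UCI datasets show that the improved random forest algorithm raises classification accuracy by up to 2.91%.

Keywords: random forest    classification accuracy    decision tree similarity    similarity matrix
Received: 2017-01-05
Revised: 2017-02-28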

Research on Improvement of Random Forests Algorithm Based on Classification Accuracy and Correlation
Wang Ri Sheng, Xie Hong Wei, An Jian Cheng. Research on Improvement of Random Forests Algorithm Based on Classification Accuracy and Correlation[J]. Science Technology and Engineering, 2017, 17(20).
Authors: Wang Ri Sheng  Xie Hong Wei  An Jian Cheng
Institution: Taiyuan University of Technology, Taiyuan University of Technology, Taiyuan University of Technology
Abstract: To improve the classification accuracy of the random forest algorithm, the decision trees in a random forest model are first sorted in descending order of AUC, a classification performance metric. The trees with high AUC values are then selected, and the similarity between them is computed to build a similarity matrix. Next, the selected trees are clustered according to the similarity matrix. Finally, a new random forest model is formed from the tree with the highest AUC value in each cluster. Experiments on UCI datasets show that the improved random forest algorithm raises classification accuracy by up to 2.91%.
Keywords: random forest  classification accuracy  decision tree similarity  similarity matrix
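
The selection procedure described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how tree similarity is measured or which clustering method is used, so the sketch assumes similarity is the fraction of held-out samples on which two trees predict the same label, and it uses a simple greedy threshold clustering. The names `prune_forest`, `keep_ratio`, and `threshold` are hypothetical.

```python
from itertools import combinations


def agreement(a, b):
    """Fraction of held-out samples on which two trees predict the same label.

    Assumed similarity measure; the abstract does not define one.
    """
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similarity_matrix(preds):
    """Symmetric matrix of pairwise prediction agreement between trees."""
    n = len(preds)
    sim = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        sim[i][j] = sim[j][i] = agreement(preds[i], preds[j])
    return sim


def prune_forest(aucs, preds, keep_ratio=0.5, threshold=0.8):
    """Return the indices of the trees kept in the pruned forest.

    aucs[i]  : AUC of tree i on held-out data
    preds[i] : labels predicted by tree i on the same held-out data
    keep_ratio, threshold : hypothetical tuning parameters
    """
    # Step 1: rank trees by AUC (descending) and keep the top fraction.
    ranked = sorted(range(len(aucs)), key=lambda i: aucs[i], reverse=True)
    top = ranked[:max(1, int(len(ranked) * keep_ratio))]

    # Step 2: build the similarity matrix over the retained trees.
    sim = similarity_matrix([preds[i] for i in top])

    # Step 3: greedy clustering (an assumed stand-in for the paper's method).
    # `top` is sorted by AUC, so the tree that opens a cluster is also the
    # highest-AUC member of that cluster.
    representatives = []  # positions within `top`
    for pos in range(len(top)):
        if all(sim[pos][r] < threshold for r in representatives):
            representatives.append(pos)

    # Step 4: one representative per cluster forms the new forest.
    return [top[r] for r in representatives]
```

With a fitted forest, `aucs` and `preds` would come from evaluating each tree on a validation set. Majority voting over the returned trees then gives the pruned model's prediction.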