
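A Pre-pruning ID3 Algorithm Based on the Mutual Information between Attributes
Cite this article: HAN Yi-ting, WANG Li, LIU Xiao-jun, ZHANG Cheng-yu. A Pre-pruning ID3 Algorithm Based on the Mutual Information between Attributes [J]. Journal of Guizhou University (Natural Science), 2008, 25(5).
Authors: HAN Yi-ting  WANG Li  LIU Xiao-jun  ZHANG Cheng-yu
Affiliations: 1. College of Electronic Science and Information Technology, Guizhou University, Guiyang, Guizhou 550025
2. College of Resources and Safety Engineering, China University of Mining and Technology (Beijing), Beijing 100083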
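Abstract: ID3 is a widely used and effective heuristic algorithm for decision tree induction. To address the shortcomings of ID3, this paper presents an improved version. When selecting a test attribute, it requires not only that the mutual information between the attribute and the class be large, but also that the mutual information between the attribute and the attributes already used at ancestor nodes be as small as possible. This avoids selecting redundant attributes and achieves a real reduction in information entropy. While the tree is being built, a classification threshold is set and used to prune the tree, which prevents data subsets from becoming so small that further splitting loses statistical significance. Experimental results show that the algorithm builds better decision trees than ID3.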

Keywords: ID3; mutual information; pre-pruning
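
The selection rule and the pre-pruning step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formulas, so several details here are assumptions rather than the authors' method: the score is taken to be I(A; class) minus the mean I(A; B) over the attributes B already used at ancestor nodes, the redundancy term is estimated on the full training set (inside a branch an ancestor attribute is constant, so its mutual information with anything would be zero there), and the "classification threshold" is modeled as a minimum subset size `min_samples`. The function names `select_attribute` and `build_tree` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two aligned discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def select_attribute(rows, labels, candidates, used, all_rows):
    """Pick the candidate with the highest assumed score:
    relevance I(A; class) on the current subset, minus the mean I(A; B)
    over ancestor attributes B, estimated on the full training set."""
    def score(a):
        relevance = mutual_information([r[a] for r in rows], labels)
        if not used:
            return relevance
        full_col = [r[a] for r in all_rows]
        redundancy = sum(
            mutual_information(full_col, [r[b] for r in all_rows])
            for b in used
        ) / len(used)
        return relevance - redundancy
    return max(candidates, key=score)


def build_tree(rows, labels, candidates, all_rows=None, used=(), min_samples=2):
    """Return a nested-dict tree; leaves are majority class labels."""
    all_rows = rows if all_rows is None else all_rows
    majority = Counter(labels).most_common(1)[0][0]
    # Pre-pruning (assumed form): stop on pure nodes, when no attributes
    # remain, or when the subset is smaller than the threshold.
    if len(set(labels)) == 1 or not candidates or len(rows) < min_samples:
        return majority
    best = select_attribute(rows, labels, candidates, used, all_rows)
    remaining = [a for a in candidates if a != best]
    groups = {}
    for row, label in zip(rows, labels):
        sub_rows, sub_labels = groups.setdefault(row[best], ([], []))
        sub_rows.append(row)
        sub_labels.append(label)
    return {best: {
        value: build_tree(sub_rows, sub_labels, remaining,
                          all_rows, used + (best,), min_samples)
        for value, (sub_rows, sub_labels) in groups.items()
    }}
```

Each row is a dict mapping attribute names to discrete values, and `labels` is the aligned list of class labels. Subtracting a redundancy term is only one plausible way to combine the two criteria; a ratio or a weighted difference would fit the abstract equally well.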

A Pre-pruning ID3 Algorithm Based on the Mutual Information between Attributes
HAN Yi-ting, WANG Li, LIU Xiao-jun, ZHANG Cheng-yu. A Pre-pruning ID3 Algorithm Based on the Mutual Information between Attributes [J]. Journal of Guizhou University (Natural Science), 2008, 25(5).
Authors:HAN Yi-ting  WANG Li  LIU Xiao-jun  ZHANG Cheng-yu
Institution: HAN Yi-ting¹, WANG Li¹, LIU Xiao-jun², ZHANG Cheng-yu¹ (1. College of Electronic Science and Information Technology, Guizhou University, Guiyang 550003, China; 2. College of Resource and Safety Engineering, China University of Mining and Technology (Beijing), 100086, China)
Abstract: The ID3 algorithm is a popular and efficient heuristic algorithm for decision tree induction. This paper analyzes the shortcomings of ID3 and proposes an extended version in which the test attribute is selected not only for high mutual information between the candidate attribute and the class, but also for low mutual information between the candidate attribute and the attributes of its ancestor nodes, in order to avoid selecting redundant attributes and to achieve a real reduction in entropy....
Keywords: ID3; mutual information; pre-pruning
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.