A Maximum Entropy Model Text Classification Method Based on Improved Information Gain Feature Selection
Original title: 一种基于改进信息增益特征选择的最大熵模型文本分类方法
Cite this article: HE Ming. A Maximum Entropy Model Text Classification Method Based on Improved Information Gain Feature Selection[J]. Journal of Southwest China Normal University (Natural Science), 2019, 44(3): 113-118.
Authors: HE Ming (何明)
Institution: Institute of Construction Engineering and Art Design, Chongqing Industry Polytechnic College, Chongqing 401120, China
Funding: Chongqing Social Science Planning Project (2017YBYS108); Chongqing Industry Polytechnic College Key Project (GZY201709-2B)
Received: April 3, 2018

Abstract: The traditional information gain (IG) feature selection algorithm ignores the distribution of word frequencies. To address this shortcoming, this paper proposes a new IG feature selection algorithm. The algorithm introduces an equalization ratio and a within-class word frequency position parameter, which counters the weakening of classification caused by ignoring the word frequency distribution. It also revises the traditional within-class word frequency position parameter to improve the text classification accuracy of feature selection. The improved IG feature selection algorithm is then applied to text classification with a maximum entropy (ME) model. Experimental results show that the proposed method achieves a higher F1 value in text classification than the traditional IG algorithm. In addition, the classification accuracy of ME with this method is higher than that of the K-nearest neighbor (KNN) algorithm, which indicates that the method is feasible and effective.

Keywords: information gain; equalization ratio; word frequency parameter; maximum entropy model
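
For readers unfamiliar with the baseline that the abstract criticizes, the following is a minimal sketch of the traditional IG score of a term, computed from term presence or absence only. It is not the paper's improved algorithm: the equalization ratio and the within-class word frequency position parameter are not defined in this abstract, so they are not implemented here. The function names, the toy documents, and the class labels are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Traditional IG of `term`: H(C) minus the conditional entropy of the
    class given term presence/absence. Only presence is used, so how often
    the term occurs inside a document is ignored (the shortcoming the
    abstract describes)."""
    n = len(labels)
    with_term = Counter(l for d, l in zip(docs, labels) if term in d)
    without_term = Counter(l for d, l in zip(docs, labels) if term not in d)
    n_with = sum(with_term.values())
    n_without = n - n_with
    h_class = entropy(Counter(labels).values())
    h_conditional = (
        (n_with / n) * entropy(with_term.values())
        + (n_without / n) * entropy(without_term.values())
    )
    return h_class - h_conditional


# Hypothetical toy corpus: each document is a set of tokens.
docs = [
    {"goal", "match"},
    {"goal", "team"},
    {"stock", "market"},
    {"stock", "bank"},
]
labels = ["sports", "sports", "finance", "finance"]

for term in ("goal", "team"):
    print(term, information_gain(docs, labels, term))
```

In this toy corpus, "goal" separates the two classes perfectly and therefore receives the maximum score of 1 bit, while "team" appears in only one document and scores lower. A frequency-aware variant such as the one the paper proposes would additionally weight terms by how their occurrences are distributed within and across classes.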
This article is indexed in CNKI and other databases.