An enhanced text categorization method based on improved text frequency approach and mutual information algorithm
Authors: Pei Zhili, Shi Xiaohu, Maurizio Marchese, Liang Yanchun
Abstract: Text categorization plays an important role in data mining, and feature selection is its most important step. Focusing on feature selection, we present an improved text frequency method that filters out low-frequency features during data preprocessing, propose an improved mutual information algorithm for feature selection, and develop an improved tf.idf method for evaluating feature weights. The proposed method is applied to the Reuters-21578 Top10 benchmark test set to examine its effectiveness. Numerical results show that the precision, recall, and F1 value of the proposed method are all superior to those of existing conventional methods.
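The abstract does not state the improved formulas, so the following is only a minimal sketch of the conventional baselines that the three steps build on: a document-frequency cutoff for low-frequency terms, textbook pointwise mutual information between a term and a class, and plain tf.idf weighting. All function names, the `min_df` parameter, and the toy documents are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def document_frequency(docs):
    """Count, for each term, how many documents contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # each document counts a term at most once
    return df


def filter_low_frequency(df, min_df):
    """Keep terms that appear in at least min_df documents (assumed cutoff rule)."""
    return {term for term, count in df.items() if count >= min_df}


def mutual_information(docs, labels, term, cls):
    """Textbook pointwise MI: log(P(t, c) / (P(t) * P(c))), not the paper's variant."""
    n = len(docs)
    n_t = sum(1 for d in docs if term in d)
    n_c = sum(1 for y in labels if y == cls)
    n_tc = sum(1 for d, y in zip(docs, labels) if y == cls and term in d)
    if n_tc == 0:
        return float("-inf")  # term never co-occurs with the class
    return math.log((n_tc * n) / (n_t * n_c))


def tfidf_weights(tokens, df, n_docs):
    """Plain tf.idf: raw term count times log(N / df), not the paper's variant."""
    tf = Counter(tokens)
    return {t: tf[t] * math.log(n_docs / df[t]) for t in tf if df[t] > 0}


if __name__ == "__main__":
    # Toy corpus; the tokens and class labels are made up for illustration.
    docs = [["oil", "price", "rise"], ["oil", "export"],
            ["wheat", "crop"], ["wheat", "price"]]
    labels = ["crude", "crude", "grain", "grain"]

    df = document_frequency(docs)
    kept = filter_low_frequency(df, min_df=2)
    for term in sorted(kept):
        print(term, mutual_information(docs, labels, term, "crude"))
    print(tfidf_weights(docs[0], df, len(docs)))
```

In this baseline form, a term that occurs only in documents of one class gets a positive score for that class, while a term spread evenly across classes scores zero, which is the behaviour a feature selector would rank on.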
Keywords:
Source: 《自然科学进展》 (Progress in Natural Science)