
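A Text Feature Dimensionality Reduction Method Based on Pattern Aggregation Theory and Its Application to Text Classification
Cite this article: LI Guan-jun, CHEN Xue-song, XU Jian-suo. A Text Feature Dimensionality Reduction Method Based on Pattern Aggregation Theory and Its Application to Text Classification[J]. Journal of Beijing Institute of Technology, 2005, 25(12): 1087-1091.
Authors: LI Guan-jun, CHEN Xue-song, XU Jian-suo
Affiliations: School of Management, Tianjin University, Tianjin 300072; School of Management, University of Science and Technology Beijing, Beijing 100083; Henan Electric Power Corporation, Zhengzhou, Henan 450015
Abstract: A new method for reducing the dimensionality of text features is proposed based on pattern aggregation theory. Its text classification performance is tested in combination with dynamic Kohonen network theory. A supervision mechanism is introduced in the network training stage, which improves the network's classification speed and accuracy. Pattern aggregation (PA) theory is applied to build the vector space model of the text collection. This strengthens the role of terms according to their contribution to classification, eliminates redundant patterns contained in the original term matrix, effectively reduces the dimensionality of the vector space, and improves the accuracy and speed of text classification. Experiments demonstrate the generalization ability of the method.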

Keywords: text classification; pattern aggregation; Kohonen network; vector space model
Article ID: 1001-0645(2005)12-1087-05
Received: 03 12 2005
Revised: March 12, 2005

Method and Application of Decreasing Text Feature Based on Pattern Aggregation
LI Guan-jun, CHEN Xue-song, XU Jian-suo. Method and Application of Decreasing Text Feature Based on Pattern Aggregation[J]. Journal of Beijing Institute of Technology (Natural Science Edition), 2005, 25(12): 1087-1091.
Authors: LI Guan-jun, CHEN Xue-song and XU Jian-suo
Institution: 1. School of Management, Tianjin University, Tianjin 300072, China; 2. School of Management, University of Science and Technology Beijing, Beijing 100083, China; 3. Henan Electric Power Corporation, Zhengzhou, Henan 450015, China
Abstract: A new method for reducing the dimension of text feature vectors, based on the theory of pattern aggregation (PA), is presented. Combined with a Kohonen network, the method achieves better text categorization results. The Kohonen network is used to classify texts, and a supervision mechanism is applied during network training, which improves the speed and precision of classification. For high-dimensional text vectors, however, classification with a Kohonen network is still very slow, and in some cases no classification result can be obtained at all. The new method builds a vector space model of term weights using PA theory, which strengthens the role of terms according to their contribution to categorization and reduces the vector dimension by eliminating redundant features. The method therefore substantially improves the speed and precision of text categorization, and experiments confirm its generalization ability.
Keywords:text categorization  pattern aggregation  Kohonen network  vector space model
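
The abstract does not give the PA formulas, so the following is only a minimal sketch of one plausible reading of "aggregating terms by their contribution to classification". Each term is mapped to the class in which it occurs most often, weighted by that class's share of the term's occurrences, and a document is then represented by one aggregated feature per class instead of one feature per term. The function names `fit_aggregation` and `transform`, the share-based weight, and the toy data are all assumptions for illustration, not the authors' definitions.

```python
from collections import defaultdict

def fit_aggregation(docs, labels):
    """Map each term to (dominant class, share of its occurrences in that class).

    Assumption: 'contribution to classification' is approximated by the
    fraction of a term's occurrences that fall in its dominant class.
    """
    term_class_counts = defaultdict(lambda: defaultdict(int))
    for tokens, label in zip(docs, labels):
        for tok in tokens:
            term_class_counts[tok][label] += 1
    mapping = {}
    for term, counts in term_class_counts.items():
        total = sum(counts.values())
        # Iterate classes in sorted order so ties resolve deterministically.
        best = max(sorted(counts), key=lambda c: counts[c])
        mapping[term] = (best, counts[best] / total)
    return mapping

def transform(tokens, mapping, classes):
    """Project a token list onto one aggregated feature per class."""
    vec = {c: 0.0 for c in classes}
    for tok in tokens:
        if tok in mapping:  # terms unseen during fitting are ignored
            cls, weight = mapping[tok]
            vec[cls] += weight
    return [vec[c] for c in classes]

# Toy labeled corpus (hypothetical data).
docs = [
    ["goal", "match", "team"],
    ["team", "coach", "goal"],
    ["stock", "market", "bank"],
    ["bank", "market", "team"],
]
labels = ["sports", "sports", "finance", "finance"]

classes = sorted(set(labels))
mapping = fit_aggregation(docs, labels)
reduced = transform(["goal", "bank", "team", "unknown"], mapping, classes)
print(len(mapping), "terms reduced to", len(reduced), "features:", reduced)
```

In this sketch the reduced vectors have as many dimensions as there are classes, and they could then be fed to a classifier such as the Kohonen network named in the abstract. The paper's actual aggregation may keep more dimensions or use a different weighting.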
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.