
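A Text Classification Method Based on Fuzzy VSM and Neural Networks
Cite this article: Pan Junhui. A Text Classification Method Based on Fuzzy VSM and Neural Networks[J]. Science Technology and Engineering, 2011, 11(9).
Author: Pan Junhui
Affiliation: Northeast Petroleum University, Daqing 163318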
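Abstract: To address the problem that a single text may belong to several classes in automatic text classification, this paper proposes an automatic text classification method based on a fuzzy vector space model and a neural network. Drawing on fuzzy set theory, the method treats the position at which a feature term occurs in a document as the degree (membership) to which the term reflects the document's topic, and takes this position information fully into account during feature extraction. The resulting fuzzy feature vectors bring automatic classification closer to manual classification. The network consists of an input layer, a hidden layer, and an output layer: the input layer receives the samples to be classified, the hidden layer extracts the pattern features implicit in the input samples, and the output layer produces the classification result. Experiments on a subset of documents from the Wanfang database verify the effectiveness of the method.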
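
The position-weighted feature vector described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions: the zone names, the membership values in `ZONE_MEMBERSHIP`, the summing of memberships per occurrence, and the unit-length normalization are all hypothetical choices, since the abstract does not give the paper's actual membership function or weighting formula.

```python
import math
from collections import defaultdict

# Hypothetical membership degrees per document zone (assumption: the abstract
# does not state the actual values or zones used in the paper).
ZONE_MEMBERSHIP = {"title": 1.0, "abstract": 0.8, "body": 0.5}


def fuzzy_feature_vector(zones, vocabulary):
    """Build a position-weighted term vector.

    zones: dict mapping a zone name to the list of tokens found in that zone.
    vocabulary: ordered list of feature terms.
    Each occurrence of a term contributes the membership degree of the zone
    it appears in; the result is scaled to unit Euclidean length.
    """
    weights = defaultdict(float)
    for zone, tokens in zones.items():
        mu = ZONE_MEMBERSHIP.get(zone, 0.0)  # unknown zones contribute nothing
        for token in tokens:
            weights[token] += mu

    vector = [weights.get(term, 0.0) for term in vocabulary]
    norm = math.sqrt(sum(w * w for w in vector))
    return [w / norm for w in vector] if norm > 0 else vector


doc = {
    "title": ["fuzzy", "classification"],
    "abstract": ["fuzzy", "network"],
    "body": ["network", "network", "text"],
}
vocab = ["fuzzy", "classification", "network", "text", "petroleum"]
vector = fuzzy_feature_vector(doc, vocab)
```

In this toy document, "classification" appears once in the title and "text" once in the body, so the former receives the larger weight even though both have the same raw frequency.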

Keywords: text classification  fuzzy vector space  neural network  fuzzy feature vector  feature extraction  membership degree
Received: 1/3/2011 8:03:54 PM
Revised: 1/11/2011 6:27:11 PM

A Kind of Text Classification Method Based on Fuzzy Vector Space Model
Pan Jun-Hui.A Kind of Text Classification Method Based on Fuzzy Vector Space Model[J].Science Technology and Engineering,2011,11(9).
Authors:Pan Jun-Hui
Institution: PAN Jun-hui, WANG Hui (Dept. of Computer Science, Northeast Petroleum University, Daqing 163318, P.R. China)
Abstract: To address the problem that a text may belong to several classes during text classification, a text classification method based on a fuzzy vector space model and neural networks is proposed. The method adopts fuzzy set theory, treating the position at which a feature item occurs in a text as the degree of importance (membership) with which it reflects the text's subject, and fully considers this position information when features are extracted. The fuzzy feature vectors constructed in this way bring the classification closer to manual classification. The network consists of an input layer, a hidden layer, and an output layer: the input layer receives the classification samples, the hidden layer extracts the implicit pattern features of the input samples, and the output layer outputs the classification results. Finally, experiments on a set of documents from the WanFang database verify the effectiveness of the method.
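
The three-layer structure described above can be illustrated with a small forward pass. This is a sketch only, not the paper's network: the sigmoid activation, the random untrained weights, the layer sizes, and the 0.5 threshold used to let one text fall into several classes are all assumptions, and the helper names (`init_layer`, `forward`, `assign_classes`) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """Apply one fully connected layer followed by a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(n_out, n_in, rng):
    """Random, untrained weights (assumption: training is not shown here)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, hidden, output):
    """Input layer -> hidden layer -> output layer."""
    h = layer(x, *hidden)      # hidden layer: extracts pattern features
    return layer(h, *output)   # output layer: one score per class


def assign_classes(scores, threshold=0.5):
    """Return every class whose score reaches the threshold, so a text
    may be assigned to more than one class (hypothetical decision rule)."""
    return [i for i, s in enumerate(scores) if s >= threshold]


rng = random.Random(0)
hidden = init_layer(4, 5, rng)   # 5 input features, 4 hidden units
output = init_layer(3, 4, rng)   # 3 candidate classes
scores = forward([0.65, 0.36, 0.65, 0.18, 0.0], hidden, output)
labels = assign_classes(scores)
```

Using an independent sigmoid per output unit, rather than a softmax, is what allows several classes to pass the threshold at once; whether the paper makes the same choice is not stated in the abstract.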
Keywords: text classification  fuzzy vector space  neural networks  fuzzy feature vector  feature extraction  membership
This article is indexed in CNKI, Wanfang Data, and other databases.