An LSI Text Categorization Model Based on Text Clustering

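Cite this article:	QIU Zhiyu, AN Yanhui. An LSI text categorization model based on text clustering[J]. Journal of Hebei Normal University (Natural Science Edition), 2012, 36(1): 24-26, 83.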

Authors:	QIU Zhiyu, AN Yanhui

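Affiliations:	1. College of Mathematics and Information Science, Hebei Normal University, Shijiazhuang, Hebei 050024; 2. Industry and Information Technology Department of Hebei Province, Shijiazhuang, Hebei 050071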

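Abstract:	Automatic text categorization is the foundation of text mining and is widely applicable to information retrieval, web mining, and related fields. Before classification, a text must first be represented in a form that a computer can process. This paper proposes a method for automatic Chinese text categorization that combines latent semantic indexing (LSI) with text clustering. The method achieves good results both in mining the semantic information of texts and in improving classification speed. Experiments verify the effectiveness of the method.
Keywords:	text categorization; latent semantic indexing; text clustering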
The Model of Text Categorization Based on Latent Semantic Indexing

QIU Zhiyu, AN Yanhui. The Model of Text Categorization Based on Latent Semantic Indexing[J]. Journal of Hebei Normal University, 2012, 36(1): 24-26, 83.

Authors:	QIU Zhiyu, AN Yanhui

Institution:	1. College of Mathematics and Information Science, Hebei Normal University, Shijiazhuang, Hebei 050024, China; 2. Industry and Information Technology Department of Hebei Province, Shijiazhuang, Hebei 050071, China

Abstract:	Text categorization (TC), the foundation of text mining, can be used in information retrieval and web data mining. Before categorization, a text must first be converted into a representation that a computer can process. A new algorithm that combines latent semantic indexing (LSI) with text clustering is presented. Experiments show the algorithm to be effective.

Keywords:	text categorization; latent semantic indexing; text clustering
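
The abstract does not spell out how LSI and text clustering are combined, so the following is only an illustrative sketch of one common arrangement, not the authors' pipeline. It assumes a truncated SVD for the LSI step, a plain k-means run inside each class in the latent space, and nearest-centroid classification by cosine similarity. The toy matrix, the labels, and every function name (`lsi_fit`, `lsi_project`, `kmeans`, `train`, `classify`) are hypothetical.

```python
import numpy as np

# Toy term-by-document matrix (rows = terms, columns = documents).
# All values and labels are invented for illustration.
TERM_DOC = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 2, 0, 0, 0, 0],
    [0, 1, 2, 0, 0, 1],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 2, 0],
    [0, 0, 1, 0, 1, 2],
], dtype=float)
LABELS = ["sports", "sports", "sports", "finance", "finance", "finance"]


def lsi_fit(term_doc, k):
    """Truncated SVD: keep the k largest singular triplets (U_k, s_k)."""
    U, s, _ = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k], s[:k]


def lsi_project(U_k, s_k, vectors):
    """Fold term-space column vectors into the k-dimensional latent space."""
    return (U_k.T @ vectors) / s_k[:, None]


def kmeans(points, n_clusters, n_iter=20):
    """Plain k-means on the rows of `points`, seeded with the first rows."""
    n_clusters = min(n_clusters, len(points))
    centroids = points[:n_clusters].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for c in range(n_clusters):
            members = points[assign == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    return centroids


def train(term_doc, labels, k, clusters_per_class):
    """Project training documents with LSI, then cluster each class."""
    U_k, s_k = lsi_fit(term_doc, k)
    latent = lsi_project(U_k, s_k, term_doc).T  # one row per document
    labels = np.asarray(labels)
    centroids, centroid_labels = [], []
    for label in sorted(set(labels.tolist())):
        cents = kmeans(latent[labels == label], clusters_per_class)
        centroids.append(cents)
        centroid_labels += [label] * len(cents)
    return U_k, s_k, np.vstack(centroids), centroid_labels


def classify(model, term_vector):
    """Assign the label of the most cosine-similar cluster centroid."""
    U_k, s_k, centroids, centroid_labels = model
    q = lsi_project(U_k, s_k, np.asarray(term_vector, dtype=float)[:, None])[:, 0]
    sims = centroids @ q / (np.linalg.norm(centroids, axis=1) * np.linalg.norm(q))
    return centroid_labels[int(sims.argmax())]


model = train(TERM_DOC, LABELS, k=2, clusters_per_class=2)
print(classify(model, [1, 1, 1, 0, 0, 0]))
print(classify(model, [0, 0, 0, 1, 1, 1]))
```

In this sketch a new document is compared only with a handful of cluster centroids in a low-dimensional space rather than with every training document in term space. That may be where a speed gain of the kind the abstract reports would come from, though the abstract itself does not say so.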
This article is indexed in CNKI, Wanfang Data, and other databases.
