An Incremental Algorithm of Text Clustering Based on Semantic Sequences
Authors: FENG Zhonghui, SHEN Junyi, BAO Junpeng
Affiliation: Institute of Computer Software, Xi'an Jiaotong University, Xi'an 710049, Shaanxi, China
Summary: 0 Introduction. Text clustering is the process of grouping documents into classes or clusters so that documents within a cluster have high similarity to one another but are very dissimilar to documents in other clusters. In applications, a document is usually represented by the vector space model (VSM), in which each document is represented as a vector and each unique term is one dimension of this vector. Documents are then clustered by calculating distance or similarity [1], …
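The vector space representation described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it only shows, under simple assumptions (whitespace tokenization, raw term counts, cosine similarity), how documents become term vectors and how a similarity between them might be computed. The function names `tf_vector` and `cosine_similarity` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Map a document to a sparse term-frequency vector (term -> count).

    Assumption: terms are lowercase whitespace-separated tokens.
    """
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse term vectors."""
    # Only terms present in both documents contribute to the dot product.
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical sample documents, for illustration only.
docs = [
    "text clustering groups similar documents",
    "clustering groups documents by similarity",
    "the weather is sunny today",
]
vectors = [tf_vector(d) for d in docs]

sim_01 = cosine_similarity(vectors[0], vectors[1])  # documents share three terms
sim_02 = cosine_similarity(vectors[0], vectors[2])  # documents share no terms
print(sim_01, sim_02)
```

In this sketch every unique term is one dimension, so documents with no terms in common get similarity 0 regardless of meaning. A clustering method would then group documents whose pairwise similarity is high.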

Keywords (Chinese record): text clustering; semantic sequence; incremental algorithm
Article ID: 1007-1202(2006)05-1340-05
Received: 2006-03-23

FENG Zhonghui SHEN Junyi BAO Junpeng.An Incremental Algorithm of Text Clustering Based on Semantic Sequences[J].Wuhan University Journal of Natural Sciences,2006,11(5):1340-1344.
Abstract: This paper proposes an incremental text clustering algorithm based on semantic sequences. Using the similarity relation between semantic sequences and computing the cover of the set of similar semantic sequences, the algorithm selects, at each step, the candidate cluster with the minimum entropy overlap value as a result cluster. Experimental comparison shows that the precision of the algorithm is higher than that of other algorithms under the same conditions, and that the advantage is especially clear on sets of long documents.
Keywords: text clustering; semantic sequence; entropy
This article is indexed in CNKI, VIP, Wanfang Data, SpringerLink, and other databases.