A Chinese Web page clustering algorithm based on the suffix tree
Authors: Yang Jian-wu
Institution: National Key Laboratory for Text Processing, Institute of Computer Science and Technology, Peking University, Beijing 100871, China
Abstract: This paper proposes STC-I, an improved algorithm for clustering Chinese Web pages. It builds on the characteristics of the Chinese language by adopting a new unit-choice principle and a novel suffix tree construction policy. Experimental results show that the new algorithm retains the advantages of STC and outperforms it in both precision and speed when clustering Chinese Web pages.
Keywords: clustering; suffix tree; Web mining
This paper is indexed in databases including CNKI, VIP (维普), Wanfang Data (万方数据), and SpringerLink.