
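An Improved K-Nearest Neighbor Algorithm and Its Application to Text Classification
Cite this article: KOU Sha-sha, WEI Zhen-jun. An Improved K-Nearest Neighbor Algorithm and Its Application to Text Classification [J]. Journal of Henan Normal University (Natural Science), 2005, 33(3): 134-136.
Authors: KOU Sha-sha, WEI Zhen-jun
Affiliation: Institute of Information Engineering, Information Engineering University, Zhengzhou 450002
Abstract: When the K-nearest neighbors (KNN) algorithm is used for classification and the number of training samples is very large, searching for the K nearest neighbors of a test sample is computationally expensive. This paper proposes an improvement that addresses this weakness of KNN. The improved algorithm defines the extension category and extension capability of a sample, and keeps samples with strong extension capability as representatives of the other training samples in their extension categories. This reduces the number of training samples and therefore the amount of computation. Experiments show that the improved KNN algorithm performs well.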
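The abstract does not give formal definitions, so the following is only a minimal sketch of the sample-reduction idea under assumed definitions, not the authors' method. It assumes that the extension radius of a sample is its distance to the nearest sample of a different class, that its extension category is the set of same-class samples strictly inside that radius, and that its extension capability is the size of that set. The function names `condense_training_set` and `knn_predict` are hypothetical, and plain coordinate tuples stand in for document feature vectors.

```python
import math
from collections import Counter


def condense_training_set(samples, labels):
    """Return indices of retained samples (assumed definitions, see text)."""
    n = len(samples)
    dist = [[math.dist(samples[i], samples[j]) for j in range(n)]
            for i in range(n)]

    # Assumption: extension radius = distance to the nearest sample
    # of a different class (infinite if no such sample exists).
    radius = [
        min((dist[i][j] for j in range(n) if labels[j] != labels[i]),
            default=math.inf)
        for i in range(n)
    ]

    # Assumption: extension category = same-class samples strictly
    # inside the extension radius.
    category = [
        {j for j in range(n)
         if labels[j] == labels[i] and dist[i][j] < radius[i]}
        for i in range(n)
    ]

    # Assumption: extension capability = size of the extension category.
    # Visit samples from strongest to weakest; keep a sample only if no
    # previously kept sample already represents it.
    order = sorted(range(n), key=lambda i: -len(category[i]))
    kept, covered = [], set()
    for i in order:
        if i not in covered:
            kept.append(i)
            covered |= category[i]
    return sorted(kept)


def knn_predict(samples, labels, query, k=3):
    """Plain majority-vote KNN over the given (possibly reduced) set."""
    nearest = sorted(range(len(samples)),
                     key=lambda i: math.dist(samples[i], query))[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

After reduction, classification runs ordinary KNN over the retained samples only, so the neighbor search costs scale with the reduced set rather than the full training set. Because fewer samples remain, a smaller K than in the unreduced setting is usually appropriate.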

Keywords: extension radius; extension category; extension capability; K-nearest neighbor algorithm
Article ID: 1000-2367(2005)03-0134-03
Received: 2005-01-05
Revised: 2005-01-05

Improvement of K Nearest Neighbors and Applications in Text Classification
KOU Sha-sha,WEI Zhen-jun.Improvement of K Nearest Neighbors and Applications in Text Classification[J].Journal of Henan Normal University(Natural Science),2005,33(3):134-136.
Authors:KOU Sha-sha  WEI Zhen-jun
Abstract: When the K-nearest neighbor (KNN) method is used to assign new texts to categories, searching for the k nearest neighbors of a new text requires a large amount of computation if the training set is large. This paper introduces the extension category and extension capability of a training sample and presents an improved KNN method that addresses this problem by selecting good samples from the training set to form a new, smaller training set. Experimental results show that the improved method performs well.
Keywords:extension radius  extension category  extension capability  KNN
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.