
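Entropy-KNN Algorithm Based on Hierarchical Clustering
Cite this article: TONG Xian-qun, ZHOU Zhong-mei. Entropy-KNN Algorithm Based on Hierarchical Clustering[J]. Journal of Zhangzhou Teachers College, 2012(1):43-47.
Authors: TONG Xian-qun  ZHOU Zhong-mei
Affiliation: Department of Computer Science and Engineering, Zhangzhou Normal University, Zhangzhou, Fujian 363000
Funding: National Natural Science Foundation of China (61170129)
Abstract: The KNN algorithm classifies a test sample by the number of nearest neighbors in each class. The Entropy-KNN algorithm introduces a new similarity definition and, when voting, combines the number of nearest neighbors in each class with the average distance between the test sample and the neighbors of each class. Neither algorithm, however, considers the similarity among the nearest neighbors themselves. The proposed Entropy-KNN algorithm based on hierarchical clustering first applies hierarchical clustering to the training set class by class, then selects the nearest neighbors from the sub-clusters most similar to the test sample, so that the neighbors are highly similar to one another, and finally classifies with the Entropy-KNN algorithm. Experiments on the mushroom data set show that the algorithm's classification accuracy is higher than that of the Entropy-KNN algorithm.

Keywords: classification  KNN algorithm  information entropy  clustering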

Entropy-KNN Algorithm Based on Hierarchical Clustering
TONG Xian-qun, ZHOU Zhong-mei. Entropy-KNN Algorithm Based on Hierarchical Clustering[J]. Journal of ZhangZhou Teachers College (Philosophy & Social Sciences), 2012(1):43-47.
Authors:TONG Xian-qun  ZHOU Zhong-mei
Institution:(Department of Computer Science & Engineering, Zhangzhou Normal University, Zhangzhou, Fujian 363000, China)
Abstract: The KNN algorithm assigns a test sample the class label held by the largest number of its K nearest neighbors. The Entropy-KNN algorithm defines a new distance between two samples and decides the class label from both the number of nearest neighbors in each class and their average distance to the test sample. Neither algorithm, however, considers how similar the K nearest neighbors are to one another, which is useful for deciding the class label of the test sample. To address this, we propose an Entropy-KNN algorithm based on hierarchical clustering. First, the training samples of each class are clustered separately. Second, the K nearest neighbors are selected from the sub-clusters nearest to the test sample. Finally, the class label of the test sample is decided by the Entropy-KNN algorithm. Experiments on the mushroom data set show that our approach achieves higher classification accuracy than the Entropy-KNN algorithm.
Keywords: classification  K-Nearest Neighbor algorithm  information entropy  clustering
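
The three steps named in the abstract (per-class clustering, neighbor selection from the nearest sub-cluster, voting on count and average distance) can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions: the abstract does not give the entropy-based distance, so a plain Hamming distance stands in for it; average linkage, the `n_clusters` parameter, and the scoring rule `count / (1 + average distance)` are hypothetical choices made only so the example runs; the toy samples are invented and are not taken from the mushroom data set.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Fraction of attributes on which two categorical samples differ.
    ASSUMPTION: stand-in for the entropy-based distance, which the abstract does not define."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def avg_link(c1, c2, dist):
    """Average pairwise distance between two clusters (average linkage, an assumed choice)."""
    return sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))


def agglomerate(samples, n_clusters, dist):
    """Bottom-up hierarchical clustering: merge the two closest clusters until n_clusters remain."""
    clusters = [[s] for s in samples]
    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: avg_link(clusters[p[0]], clusters[p[1]], dist))
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters


def classify(train, labels, x, k=3, n_clusters=2, dist=hamming):
    # Step 1: group the training samples by class and cluster each class separately.
    by_class = defaultdict(list)
    for s, y in zip(train, labels):
        by_class[y].append(s)

    # Step 2: in each class keep only the sub-cluster closest to the test sample x.
    candidates = []
    for y, samples in by_class.items():
        clusters = agglomerate(samples, min(n_clusters, len(samples)), dist)
        best = min(clusters, key=lambda c: avg_link([x], c, dist))
        candidates.extend((dist(x, s), y) for s in best)

    # Step 3: take the k nearest candidates and vote.
    neighbors = sorted(candidates, key=lambda t: t[0])[:k]
    counts = Counter(y for _, y in neighbors)
    totals = defaultdict(float)
    for d, y in neighbors:
        totals[y] += d
    # ASSUMPTION: hypothetical score; more neighbors and a smaller average distance both help.
    scores = {y: counts[y] / (1.0 + totals[y] / counts[y]) for y in counts}
    return max(scores, key=scores.get)


# Invented toy data with three categorical attributes per sample.
train = [("x", "s", "n"), ("x", "s", "y"), ("x", "y", "n"),
         ("b", "s", "w"), ("b", "y", "w"), ("b", "f", "w")]
labels = ["p", "p", "p", "e", "e", "e"]

print(classify(train, labels, ("x", "s", "w"), k=3))
print(classify(train, labels, ("b", "y", "w"), k=3))
```

Restricting the candidates to one sub-cluster per class is what makes the selected neighbors similar to one another; with `n_clusters=1` the sketch falls back to an ordinary nearest-neighbor vote over the whole training set.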
This article is indexed in VIP (维普) and other databases.