首页 | 本学科首页   官方微博 | 高级检索  
     检索      

2种加速K-近邻方法的实验比较
引用本文:翟俊海,王婷婷,张明阳,王耀达,刘明明.2种加速K-近邻方法的实验比较[J].河北大学学报(自然科学版),2016,36(6):650.
作者姓名:翟俊海  王婷婷  张明阳  王耀达  刘明明
基金项目:国家自然科学基金资助项目(71371063);河北省高等学校科学技术研究重点项目(ZD20131028);河北大学研究生创新项目(X2016059)
摘    要:K-近邻(K-NN:K-nearest neighbors)是著名的数据挖掘算法,应用非常广泛.K-NN思想简单,易于实现,其计算时间复杂度和空间复杂度都是O(n),n为训练集中包含的样例数.当训练集比较大时,特别是面对大数据集时,K-NN算法的效率会变得非常低,甚至不可行.本文用实验的方法比较了2种加速K-NN的方法,2种加速方法分别是压缩近邻(CNN:condensed nearest neighbor)方法和基于MapReduce的K-NN.具体地,在Hadoop环境下,用MapReduce编程实现了K-NN算法,并与CNN算法在8个数据集上进行了实验比较,得出了一些有价值的结论,对从事相关研究的人员具有一定的借鉴作用.

关 键 词:K-近邻  数据挖掘  MapReduce  Hadoop  
收稿时间:2016-07-11

Experimental comparison of two acceleration approaches for K-nearest neighbors
ZHAI Junhai,WANG Tingting,ZHANG Mingyang,WANG Yaoda,LIU Mingming.Experimental comparison of two acceleration approaches for K-nearest neighbors[J].Journal of Hebei University (Natural Science Edition),2016,36(6):650.
Authors:ZHAI Junhai  WANG Tingting  ZHANG Mingyang  WANG Yaoda  LIU Mingming
Institution:College of Mathematics and Information Science, Hebei University, Baoding 071002, China
Abstract:K-NN(K-nearest neighbors)is a famous data mining algorithm with wide range of applications.The idea of K-NN is simple and it is easy to implement. Both computational time and space complexity of K-NN are all O(n),where,n is the number of instances in a training set.When K-NN encountered larger training sets,especially faced with big data sets,the efficiency of K-NN becomes very low,even K-NN is impracticable.Two acceleration approaches for K-nearest neighbors are experimentally compared on 8 data sets.The two acceleration approaches are the CNN and MapReduce based K-NN.Specifically,in Hadoop environment,this paper implements K-NN with MapReduce,and experimentally compares with CNN on 8 data sets. Some valuable conclusions are obtained,and may be useful for researchers in related fields.
Keywords:K-nearest neighbors  data mining  MapReduce  Hadoop  
本文献已被 CNKI 等数据库收录!
点击此处可从《河北大学学报(自然科学版)》浏览原始摘要信息
点击此处可从《河北大学学报(自然科学版)》下载免费的PDF全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号