
High-dimensional classification algorithm based on K-nearest neighbor improved with a weight search tree
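Cite this article:梁淑蓉,陈基漓,谢晓兰. 基于权重搜索树改进K近邻的高维分类算法[J]. 科学技术与工程, 2021, 21(7): 2760-2766. DOI: 10.3969/j.issn.1671-1815.2021.07.029
Authors:梁淑蓉 (Liang Shurong)  陈基漓 (Chen Jili)  谢晓兰 (Xie Xiaolan)
Affiliations:College of Information Science and Engineering, Guilin University of Technology, Guilin 514004; College of Information Science and Engineering, Guilin University of Technology, Guilin 514004; Guangxi Key Laboratory of Embedded Technology and Intelligent System, Guilin 514004
Funding:National Natural Science Foundation of China (61762031); Guangxi Science and Technology Major Project (桂科AA19046004); Guangxi Key Research and Development Program (桂科AB18126006)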
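Abstract (translated from the Chinese):The high-dimensional, large-scale data produced by steadily advancing information acquisition technology pose a great challenge to data mining. To address the low efficiency and high time cost of the K-nearest neighbor classification algorithm on high-dimensional data, a high-dimensional classification algorithm that improves K-nearest neighbor with a weight search tree (K-nearest neighbor algorithm based on weight search tree, KNN-WST) is proposed. The algorithm, according to...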

Keywords:high-dimensional data  K-nearest neighbor classification algorithm  feature attribute  search tree  Minkowski distance
Received:2020-06-29
Revised:2020-12-08

Improved k-nearest neighbor algorithm based on weight search tree for high-dimensional classification
Liang Shurong,Chen Jili,Xie Xiaolan. Improved k-nearest neighbor algorithm based on weight search tree for high-dimensional classification[J]. Science Technology and Engineering, 2021, 21(7): 2760-2766. DOI: 10.3969/j.issn.1671-1815.2021.07.029
Authors:Liang Shurong  Chen Jili  Xie Xiaolan
Affiliation:College of Information Science and Engineering, Guilin University of Technology
Abstract:The ongoing development of information acquisition techniques has produced high-dimensional, large-scale data, which pose an enormous challenge to data mining. To address the low efficiency and high time cost of the K-nearest neighbor classification algorithm on high-dimensional data, this paper proposes an improved K-nearest neighbor algorithm based on a weight search tree (WSTKnn) for high-dimensional classification. The algorithm selects a subset of the feature attributes, according to their weights, as nodes for constructing a search tree. The search tree divides the data set into different matrix regions. An unknown sample traverses the search tree to find the most similar matrix region, and its distance is computed only to the data contained in that region, which shrinks the data to be searched and thus reduces time complexity. The paper also discusses which Minkowski distance is most suitable as a distance measure for high-dimensional data. Simulation experiments on 6 standard high-dimensional data sets show that WSTKnn outperforms K-nearest neighbor, Decision Tree and SVM: its classification time is significantly reduced and its classification accuracy is better than that of the other algorithms. WSTKnn performs well on the classification of high-dimensional data and is expected to provide a reference for solving related high-dimensional data problems.
Keywords:high-dimensional data   k-nearest neighbor classification   characteristic attribute   search tree   Minkowski Distance
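
The abstract above describes restricting the nearest-neighbor search to one region chosen by a tree built on the highest-weight attributes, with a Minkowski distance as the measure. The following is only a minimal sketch of that general idea, not the paper's WSTKnn: the abstract does not say how attribute weights are computed, how nodes split the data, or what happens when a region is too small, so the variance-based weights, the median splits, the full-search fallback, and every name here (`RegionKNN`, `minkowski`, `depth`) are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from statistics import median, pvariance


def minkowski(a, b, p=2.0):
    """Minkowski distance of order p (p=1 Manhattan, p=2 Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


class RegionKNN:
    """Toy region-restricted KNN; an illustration, not the paper's method."""

    def __init__(self, k=3, depth=2, p=2.0):
        self.k, self.depth, self.p = k, depth, p

    def fit(self, X, y):
        self.X, self.y = X, y
        n_features = len(X[0])
        # ASSUMPTION: an attribute's weight is its variance over the data.
        weights = [pvariance([row[j] for row in X]) for j in range(n_features)]
        # The `depth` highest-weight attributes become the tree levels.
        ranked = sorted(range(n_features), key=lambda j: -weights[j])
        self.attrs = ranked[: self.depth]
        # ASSUMPTION: each level splits its attribute at the median.
        self.cuts = [median(row[j] for row in X) for j in self.attrs]
        # Group training indices by the leaf (region) they fall into.
        self.regions = defaultdict(list)
        for i, row in enumerate(X):
            self.regions[self._region(row)].append(i)
        return self

    def _region(self, row):
        # A root-to-leaf path, encoded as a tuple of left/right decisions.
        return tuple(row[j] > c for j, c in zip(self.attrs, self.cuts))

    def predict_one(self, q):
        candidates = self.regions.get(self._region(q), [])
        if len(candidates) < self.k:
            # ASSUMPTION: fall back to the full data set for sparse regions.
            candidates = range(len(self.X))
        # Distances are computed only against the candidate points.
        nearest = sorted(
            candidates, key=lambda i: minkowski(self.X[i], q, self.p)
        )[: self.k]
        # Majority vote among the k nearest candidates.
        return Counter(self.y[i] for i in nearest).most_common(1)[0][0]


# Usage: two well-separated groups that differ only in the first attribute.
X = [[0.0, 0.1, 5.0], [0.2, 0.0, 5.1], [0.1, 0.2, 4.9],
     [10.0, 0.1, 5.0], [10.2, 0.0, 5.1], [9.9, 0.2, 4.9]]
y = ["a", "a", "a", "b", "b", "b"]
model = RegionKNN(k=3, depth=1).fit(X, y)
print(model.predict_one([0.05, 0.1, 5.0]), model.predict_one([9.5, 0.1, 5.0]))
```

With `depth` levels the data are split into at most 2**depth regions, so each query compares against roughly that fraction of the training set. The `p` parameter is exposed so different Minkowski orders can be compared, which is the kind of comparison the abstract mentions.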
This article is indexed in CNKI, 万方数据 (Wanfang Data), and other databases.