
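An Efficient K-Nearest Neighbor Decision Algorithm for Samples with Uncertain Labels
Cite this article:QI Qing, SHEN Zhengfei, CAO Jian, YING Jun, ZHAO Long. An Efficient K-Nearest Neighbor Decision Algorithm for Samples with Uncertain Labels[J]. Journal of Applied Sciences (应用科学学报), 2020, 38(5): 659-671.
Authors:QI Qing (齐晴)  SHEN Zhengfei (沈正飞)  CAO Jian (曹健)  YING Jun (应俊)  ZHAO Long (赵龙)
Affiliations:1. Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China;2. Shanghai Harbor e-Logistics Software Co., Ltd., Shanghai 200080, China
Funding:Supported by the National Key Research and Development Program of China (No.2019YFB1704405)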
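Abstract:Case-based decision-making classifies a current case, or predicts an indicator for it, directly from past historical cases; the K-nearest neighbor method is a widely used case-based decision model. The K-nearest neighbor method requires labeled historical cases, but in real applications the labels themselves carry some uncertainty. This article discusses in detail how existing K-nearest-neighbor-based decision methods ignore the uncertainty of sample labels, models that uncertainty with Dempster-Shafer evidence theory to improve prediction performance, and on that basis incorporates the boundary tree model to improve running efficiency. It introduces the function and principle of the boundary tree algorithm, and optimizes the algorithm's node transfer strategy and decision process by combining the traditional boundary tree algorithm with sample label uncertainty. Finally, it reports detailed experiments on the computational scale and accuracy of the boundary tree algorithm. The results show that the proposed method accounts for label uncertainty while also improving the decision efficiency of the traditional K-nearest neighbor model.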

Keywords:K-nearest neighbor algorithm  label uncertainty  boundary tree algorithm  computation speed optimization
Received:2020-06-19
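
The abstract does not give the paper's uncertainty model, so the following is only a minimal sketch of how Dempster-Shafer evidence can be combined over K nearest neighbors, in the style of a common evidential k-NN formulation. Everything named here is an assumption for illustration: the functions `neighbor_mass`, `combine`, and `predict`, the per-sample `confidence` value standing in for label uncertainty, and the parameters `alpha` and `gamma`. Each neighbor assigns some mass to its own label and the remainder to the whole frame (`THETA`, meaning "unknown"), and the masses are merged with Dempster's rule of combination.

```python
import math

THETA = "THETA"  # hypothetical sentinel for the whole frame of discernment

def neighbor_mass(label, distance, confidence, alpha=0.95, gamma=1.0):
    """Mass function contributed by one neighbor (illustrative assumption).

    Support for the neighbor's label shrinks with distance and with low
    label confidence; the remaining mass stays uncommitted on THETA.
    """
    support = alpha * confidence * math.exp(-gamma * distance ** 2)
    return {label: support, THETA: 1.0 - support}

def combine(m1, m2):
    """Dempster's rule for masses whose focal sets are singletons or THETA."""
    combined = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            if a == THETA:
                key = b                  # THETA intersected with b is b
            elif b == THETA or a == b:
                key = a                  # a intersected with THETA (or itself) is a
            else:
                conflict += wa * wb      # two different singletons: empty intersection
                continue
            combined[key] = combined.get(key, 0.0) + wa * wb
    norm = 1.0 - conflict
    if norm <= 0.0:
        raise ValueError("total conflict: the masses cannot be combined")
    return {key: weight / norm for key, weight in combined.items()}

def predict(query, samples, k=3):
    """Combine the evidence of the k nearest samples and return the best label.

    samples: list of (features, label, confidence) tuples.
    """
    nearest = sorted(samples, key=lambda s: math.dist(s[0], query))[:k]
    masses = {THETA: 1.0}  # start from the vacuous (fully uncommitted) belief
    for features, label, confidence in nearest:
        distance = math.dist(features, query)
        masses = combine(masses, neighbor_mass(label, distance, confidence))
    best = max((c for c in masses if c != THETA), key=masses.get)
    return best, masses

samples = [
    ((0.0, 0.0), "a", 1.0),
    ((0.1, 0.0), "a", 0.9),
    ((1.0, 1.0), "b", 0.5),   # a label the annotator was unsure about
    ((1.1, 1.0), "b", 1.0),
]
label, masses = predict((0.05, 0.0), samples, k=3)
```

A neighbor with low `confidence` leaves more mass on `THETA`, so an uncertain label pulls the decision less than a certain one at the same distance.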

An Efficient K-Nearest Neighbor Decision Algorithm for Samples with Uncertain Labels
QI Qing, SHEN Zhengfei, CAO Jian, YING Jun, ZHAO Long. An Efficient K-Nearest Neighbor Decision Algorithm for Samples with Uncertain Labels[J]. Journal of Applied Sciences, 2020, 38(5): 659-671.
Authors:QI Qing  SHEN Zhengfei  CAO Jian  YING Jun  ZHAO Long
Institution:1. Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China;2. Shanghai Harbor e-Logistics Software Co., Ltd., Shanghai 200080, China
Abstract:Case-based decision-making classifies or predicts the current case directly from past historical cases, and the K-nearest neighbor method is a widely used case-based decision-making model. The K-nearest neighbor method requires labeled historical cases, but in practical applications the labels themselves are uncertain. This article discusses in detail the label uncertainty that existing case-based decision-making methods ignore, and sets up a label uncertainty model based on Dempster-Shafer evidence theory to improve prediction performance. To improve running efficiency, it also proposes a new boundary tree algorithm that combines the traditional boundary tree algorithm with label uncertainty. The paper introduces the function and principle of the boundary tree algorithm and optimizes the node transfer strategy and decision process of the new algorithm. Experiments show that the proposed method not only takes label uncertainty into account but also improves the decision efficiency of the traditional K-nearest neighbor model.
Keywords:K-nearest neighbor algorithm  uncertainties of labels  boundary tree algorithm  optimization of decision speed  
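
For readers unfamiliar with the boundary tree model mentioned in the abstract, here is a minimal sketch of the traditional algorithm only. The paper's uncertainty-aware node transfer strategy is not described in the abstract and is not reproduced here. The class names `Node` and `BoundaryTree`, the Euclidean metric, and the choice to allow an unlimited number of children per node are assumptions made for illustration. The idea is that a query walks greedily from the root toward the closest stored sample, and a training sample is stored only when the tree would currently mislabel it, which keeps the tree far smaller than the full training set.

```python
import math

class Node:
    """One stored training sample and its children in the tree."""
    def __init__(self, point, label):
        self.point = point
        self.label = label
        self.children = []

class BoundaryTree:
    """Plain boundary tree with unbounded children per node (a simplification)."""
    def __init__(self, point, label):
        self.root = Node(point, label)
        self.size = 1

    def query(self, point):
        """Greedy descent: move to the closest of (children + current node)."""
        node = self.root
        while True:
            candidates = node.children + [node]
            best = min(candidates, key=lambda n: math.dist(n.point, point))
            if best is node:       # the current node is closest: stop here
                return node
            node = best

    def train(self, point, label):
        """Store the sample only if the tree would currently mislabel it."""
        nearest = self.query(point)
        if nearest.label != label:
            nearest.children.append(Node(point, label))
            self.size += 1

tree = BoundaryTree((0.0, 0.0), "a")
tree.train((0.1, 0.0), "a")   # already classified correctly: discarded
tree.train((1.0, 1.0), "b")   # misclassified as "a": stored as a child of the root
tree.train((1.1, 1.0), "b")   # now classified correctly: discarded
```

After these four samples the tree holds two nodes, so a query compares against at most two stored points instead of all four, which is the source of the speedup over exhaustive K-nearest neighbor search.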
This article is indexed in CNKI and other databases.