
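A Concept Drift Data Stream Classification Algorithm Based on Local Classification Accuracy
Cite this article: ZHANG Ling, MA Shilun, LI Lihui, WEN Yimin. A Concept Drift Data Stream Classification Algorithm Based on Local Classification Accuracy[J]. Guangxi Sciences, 2024, 31(1): 100-109.
Authors: ZHANG Ling, MA Shilun, LI Lihui, WEN Yimin
Affiliation: Guangxi Key Laboratory of Image and Graphic Intelligent Processing, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China
Funding: Supported by the National Natural Science Foundation of China (62366011), the Guangxi Key Research and Development Program (Guike AB21220023), and the Guangxi Key Laboratory of Image and Graphic Intelligent Processing (GIIP2306).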
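Abstract: Classifying data streams with concept drift is a highly challenging problem. When a new concept emerges, too few training samples are available under that concept to adjust the classifier in time, which lowers classification accuracy. To address this problem, this paper proposes LA-MS-CDC, a concept drift data stream classification algorithm based on local classification accuracy. First, LA-MS-CDC combines k-means clustering with a local classification accuracy algorithm to select the optimal source domain classifier from the classifier pool. Second, the optimal source domain classifier and the target domain classifier are combined in a weighted ensemble to classify each sample. Third, the loss of each classifier is computed from the true label of the classified sample, and the weights of the target domain and source domain classifiers are updated accordingly. Fourth, the classified sample is used to update the target domain classifier and the optimal source domain classifier. Finally, the classifier pool is updated. Experimental results on public datasets show that LA-MS-CDC effectively transfers source domain knowledge to the target domain and achieves significantly better classification performance than existing methods. The code is available at https://gitee.com/ymw12345/LAMSCDC.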
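The first step, choosing a source domain classifier by local classification accuracy, can be illustrated with a minimal sketch. It is not the authors' implementation (that lives in the repository linked above): it assumes the classical nearest-neighbour form of local accuracy, leaves out the k-means component, and uses hypothetical names (`local_accuracy`, `select_source_classifier`, the toy classifiers `clf_a` and `clf_b`) chosen only for illustration.

```python
import math

def local_accuracy(classifier, x, labeled, k=3):
    """Accuracy of `classifier` on the k labeled samples nearest to x.

    `labeled` is a non-empty list of (features, label) pairs (assumed format).
    """
    neighbours = sorted(labeled, key=lambda s: math.dist(s[0], x))[:k]
    hits = sum(1 for features, label in neighbours
               if classifier(features) == label)
    return hits / len(neighbours)

def select_source_classifier(pool, x, labeled, k=3):
    """Return the pool member with the highest local accuracy around x."""
    return max(pool, key=lambda clf: local_accuracy(clf, x, labeled, k))

# Toy pool: two hypothetical threshold classifiers on 1-D inputs.
def clf_a(features):
    return int(features[0] > 0.5)

def clf_b(features):
    return int(features[0] > 2.0)

# Recently labeled samples from the stream (made-up values).
labeled = [((0.0,), 0), ((1.0,), 1), ((1.2,), 1), ((3.0,), 1)]

# Pick the classifier that is most accurate in the neighbourhood of x = 1.1.
best = select_source_classifier([clf_a, clf_b], (1.1,), labeled, k=3)
```

Evaluating accuracy only near the query sample lets a classifier that is poor globally still be chosen where it happens to be reliable, which is the intuition behind local-accuracy selection.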

Keywords: concept drift; multi-source online transfer learning; local classification accuracy; ensemble learning; diversity
Received: 2023/3/8
Revised: 2023/3/31

A Concept Drift Data Stream Classification Algorithm Based on Local Classification Accuracy
ZHANG Ling, MA Shilun, LI Lihui, WEN Yimin. A Concept Drift Data Stream Classification Algorithm Based on Local Classification Accuracy[J]. Guangxi Sciences, 2024, 31(1): 100-109.
Authors: ZHANG Ling, MA Shilun, LI Lihui, WEN Yimin
Institution: Guangxi Key Laboratory of Image and Graphic Intelligent Processing, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China
Abstract: The classification of concept drift data streams is a challenging problem. When a new concept appears, there are too few learning samples of that concept to adjust the classifier in time, which leads to low classification accuracy. To solve this problem, this article proposes LA-MS-CDC, a concept drift data stream classification algorithm based on local classification accuracy. First, LA-MS-CDC combines k-means clustering with a local classification accuracy algorithm to select the optimal source domain classifier from the classifier pool. Second, the optimal source domain classifier and the target domain classifier are combined in a weighted ensemble to classify the samples. Third, the loss of each classifier is calculated from the true labels of the classified samples, and the weights of the target domain and source domain classifiers are updated. Fourth, the classified samples are used to update the target domain classifier and the optimal source domain classifier. Finally, the classifier pool is updated. Experimental results on public datasets show that LA-MS-CDC effectively transfers source domain knowledge to the target domain and significantly improves classification performance compared with existing methods. The code is available at https://gitee.com/ymw12345/LAMSCDC.
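The weighted ensemble and the loss-driven weight update described in the abstract can be sketched as follows. The abstract does not state the exact loss or update rule, so this sketch assumes binary labels in {0, 1}, classifiers that output a class-1 probability, a squared loss, and an exponential (Hedge-style) weight update with a hypothetical learning rate `eta`. All function names and the toy stream are illustrative, not taken from the paper.

```python
import math

def ensemble_predict(w_src, w_tgt, p_src, p_tgt):
    """Weighted average of the two classifiers' class-1 probabilities."""
    return w_src * p_src + w_tgt * p_tgt

def update_weights(w_src, w_tgt, p_src, p_tgt, y, eta=0.5):
    """Scale each weight by exp(-eta * squared loss), then renormalise.

    The classifier with the larger loss on the true label y loses
    relative weight; the two weights always sum to 1 afterwards.
    """
    s = w_src * math.exp(-eta * (p_src - y) ** 2)
    t = w_tgt * math.exp(-eta * (p_tgt - y) ** 2)
    total = s + t
    return s / total, t / total

# Start with equal trust in the source and target classifiers.
w_src, w_tgt = 0.5, 0.5

# Made-up stream of (source probability, target probability, true label).
stream = [(0.9, 0.4, 1), (0.2, 0.6, 0), (0.8, 0.5, 1)]

for p_src, p_tgt, y in stream:
    # Predict first, then learn from the revealed label (online setting).
    y_hat = int(ensemble_predict(w_src, w_tgt, p_src, p_tgt) >= 0.5)
    w_src, w_tgt = update_weights(w_src, w_tgt, p_src, p_tgt, y)
```

In this toy stream the source classifier has the smaller loss at every step, so its weight grows relative to the target classifier's. On a real stream the target classifier would be expected to gain weight as it accumulates samples from the new concept.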
Keywords: concept drift; multi-source online transfer learning; local classification accuracy; ensemble learning; diversity