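Missing Data Filling Method Based on Fuzzy C-Means Clustering
Cite as: Huang Zicheng, Li Ying. Missing Data Filling Method Based on Fuzzy C-Means Clustering[J]. Journal of Jishou University (Natural Science Edition), 2020, 41(2): 23-26.
Authors: Huang Zicheng, Li Ying
Affiliation: College of Engineering Technology, Yang-En University, Quanzhou 362014, Fujian, China
Funding: Fujian Province Education and Research Project for Young and Middle-aged Teachers; Soft Science Research Program of the Fujian Provincial Department of Science and Technology
Abstract: To fill missing data effectively, the membership matrix produced by the fuzzy C-means (FCM) clustering algorithm is proposed as the weights for the values to be filled. Missing entries are first pre-filled with the mean of the same attribute. FCM is then run on the pre-filled data to obtain the membership of each sample in each class. Finally, each class's attribute mean is multiplied by the corresponding membership, and the weighted result is taken as the final filled value. In experiments on UCI data sets, the FCM filling algorithm is compared with the k-nearest neighbor (KNN) filling algorithm; the results show that the root-mean-square error of FCM filling is generally smaller than that of KNN filling.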

Keywords: missing data; fuzzy C-means clustering; membership matrix; k-nearest neighbor
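
The abstract states the filling rule only in words. One plausible way to write it down is sketched below. All symbols are assumptions introduced here for illustration, since the abstract gives no notation: c is the number of classes, u_{ik} is the membership of sample i in class k, \bar{x}_{kj} is the mean of attribute j in class k, and M is the set of missing positions (i, j) whose true values x_{ij} are known in the experiment.

```latex
% Assumed form of the membership-weighted fill for a missing entry (i, j)
\hat{x}_{ij} = \sum_{k=1}^{c} u_{ik}\,\bar{x}_{kj},
\qquad \sum_{k=1}^{c} u_{ik} = 1

% Root-mean-square error over the artificially removed entries
\mathrm{RMSE} = \sqrt{\frac{1}{|M|} \sum_{(i,j) \in M} \left(\hat{x}_{ij} - x_{ij}\right)^{2}}
```

Because the memberships of a sample sum to 1, the filled value under this reading is a convex combination of the class means, so it always lies between the smallest and largest class mean of that attribute.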

Missing Value Filling Method Based on Fuzzy C-Means Algorithm
HUANG Zicheng, LI Ying. Missing Value Filling Method Based on Fuzzy C-Means Algorithm[J]. Journal of Jishou University (Natural Science Edition), 2020, 41(2): 23-26.
Authors: HUANG Zicheng, LI Ying
Institution: College of Engineering Technology, Yang-En University, Quanzhou 362014, Fujian, China
Abstract: To fill missing data effectively, the membership matrix of the fuzzy C-means (FCM) algorithm is proposed as the weights for the values to be filled. First, the missing data are pre-filled with the mean of the same attribute. Then FCM is run on the pre-filled data to obtain the membership of each sample in each category. Finally, the membership matrix is used as weights that multiply the attribute mean of each category, and the result is taken as the final filled value. In experiments on UCI data, the root-mean-square error of FCM-based filling is generally smaller than that of k-nearest neighbor (KNN) filling.
Keywords:missing value  fuzzy C-means algorithm  membership matrix  k-nearest neighbor  
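
The three steps in the abstract (mean pre-fill, FCM, membership-weighted fill) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `fcm`, `fcm_impute`, and `rmse`, the fuzzifier `m=2.0`, the default of 3 classes, the random initialisation, and the stopping rule are all hypothetical choices. In particular, the FCM cluster centers are used as a stand-in for "the attribute mean of each category", which the abstract does not define precisely.

```python
import numpy as np

def fcm(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Plain fuzzy C-means; returns memberships U (n, c) and centers V (c, d)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)            # each row of U sums to 1
    for _ in range(n_iter):
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]   # membership-weighted centers
        dist = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)              # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V

def fcm_impute(X, c=3, m=2.0):
    """Fill NaN entries of X with membership-weighted class means."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    col_mean = np.nanmean(X, axis=0)
    prefilled = np.where(missing, col_mean, X)   # step 1: attribute-mean pre-fill
    U, V = fcm(prefilled, c, m)                  # step 2: FCM on pre-filled data
    estimate = U @ V                             # step 3: weights times class means
    return np.where(missing, estimate, X)        # observed values stay untouched

def rmse(true_values, filled_values):
    """Root-mean-square error between true and filled values."""
    diff = np.asarray(true_values, dtype=float) - np.asarray(filled_values, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

if __name__ == "__main__":
    # Toy data with two obvious groups and one hidden value (not UCI data).
    X = np.array([[1.0, 2.0], [1.2, np.nan], [0.9, 2.1],
                  [5.0, 6.0], [5.1, 6.2], [4.9, 5.9]])
    print(fcm_impute(X, c=2))
```

To reproduce the kind of comparison the abstract describes, one would remove known values from a complete data set, fill them with `fcm_impute` and with a KNN-based filler, and compare `rmse` on the removed positions only.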
This article is indexed in CNKI, Wanfang Data, and other databases.