Research on Incomplete Data Clustering (不完整数据的聚类研究)

Cite as:	Leng Yonglin, Zhang Qingchen, Lu Fuyu. Research on Incomplete Data Clustering [J]. Henan Science, 2014(11): 2259-2262.

Authors:	Leng Yonglin, Zhang Qingchen, Lu Fuyu

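Affiliations:	1. College of Information Science and Technology, Bohai University, Jinzhou 121000, Liaoning, China; 2. School of Software Technology, Dalian University of Technology, Dalian 116620, Liaoning, China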

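Funding:	Natural Science Foundation of Liaoning Province (2013020014); planning projects of the China Higher Vocational and Technical Education Research Association (GZYGH1213036, GZYGH1213035); 2014 Liaoning Economic and Social Development project of the Liaoning Provincial Federation of Social Science Circles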

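Abstract:	Data collection often yields a large amount of missing data, i.e., incomplete data. Traditional methods cluster incomplete data by imputing or discarding the missing values. Based on incomplete information system theory, this paper proposes a K-means-based clustering algorithm for incomplete data. The algorithm first partitions the data set into a complete subset and an incomplete subset and clusters the complete subset with K-means; it then assigns each incomplete object to the corresponding cluster according to a purpose-designed similarity measure, thereby clustering the whole data set. Experimental results show that the proposed method clusters incomplete data efficiently and accurately.
Keywords:	incomplete data; K-means clustering; incomplete information system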
Research on Incomplete Data Clustering

Leng Yonglin, Zhang Qingchen, Lu Fuyu. Research on Incomplete Data Clustering [J]. Henan Science, 2014(11): 2259-2262.

Authors:	Leng Yonglin, Zhang Qingchen, Lu Fuyu

Institution:	1. College of Information Science and Technology, Bohai University, Jinzhou 121000, Liaoning, China; 2. School of Software Technology, Dalian University of Technology, Dalian 116620, Liaoning, China

Abstract:	Data collection often produces a large amount of missing data, known as incomplete data. Traditional methods cluster incomplete data by either imputing or discarding the missing values. In this paper, we propose a K-means clustering algorithm for incomplete data based on incomplete information system theory. The algorithm first divides the data set into a complete data set and an incomplete data set, and clusters the complete data set with the K-means algorithm. The incomplete data are then assigned to the corresponding clusters according to a specially designed similarity measure. Experiments demonstrate that the proposed algorithm can cluster incomplete data directly, with high accuracy and efficiency.

Keywords:	incomplete data; K-means clustering; incomplete information system
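
The two-stage procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, so the sketch assumes a simple stand-in (mean squared difference to each centroid over the observed attributes only). The function names `kmeans` and `cluster_incomplete`, the use of NaN to mark missing values, and the `n_iter` and `seed` parameters are all hypothetical choices made for illustration.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's K-means on complete data; returns (centroids, labels).

    Requires at least k rows in X.
    """
    rng = np.random.default_rng(seed)
    # Initialise centroids with k distinct rows chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance of every row to every centroid: shape (n, k).
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        updated = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    # Final assignment against the final centroids.
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dist.argmin(axis=1)


def cluster_incomplete(X, k, seed=0):
    """Cluster rows of X, where NaN marks a missing attribute value."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    complete = ~missing.any(axis=1)

    # Stage 1: K-means on the complete rows only.
    centroids, complete_labels = kmeans(X[complete], k, seed=seed)
    labels = np.full(len(X), -1)
    labels[complete] = complete_labels

    # Stage 2: assign each incomplete row to its most similar cluster.
    for i in np.flatnonzero(~complete):
        observed = ~missing[i]
        if not observed.any():
            continue  # nothing observed: leave the row unassigned (-1)
        # ASSUMPTION: similarity is the mean squared difference over observed
        # attributes; the paper's own measure is not given in the abstract.
        d = ((centroids[:, observed] - X[i, observed]) ** 2).mean(axis=1)
        labels[i] = d.argmin()
    return labels, centroids
```

Calling `cluster_incomplete(X, k)` on an array with NaN entries returns one cluster label per row plus the centroids learned from the complete rows. Because the centroids are never updated in stage 2, incomplete objects cannot distort the clusters found on complete data; whether the published algorithm makes the same choice is not stated in the abstract.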
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
