
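Research on a Clustering-Based Outlier Detection Algorithm for High-Dimensional Data Streams
Cite this article: CHENG Yan, MIAO Yong-chun. Research on a Clustering-Based Outlier Detection Algorithm for High-Dimensional Data Streams[J]. Journal of Jiangxi Normal University (Natural Sciences Edition), 2014, 0(5): 449-453.
Authors: CHENG Yan, MIAO Yong-chun
Affiliation: College of Computer Information and Engineering, Jiangxi Normal University, Nanchang, Jiangxi 330022
Funding: National Social Science Fund, Education Youth Project "Research on Swarm-Intelligence-Based Construction Methods for Educational Virtual Communities"; National Natural Science Foundation of China, Regional Fund
Abstract: To address the low efficiency and accuracy of clustering-based outlier detection algorithms on high-dimensional data streams, a clustering-based outlier detection algorithm for high-dimensional data streams (CODHD-Stream) is proposed. The algorithm first partitions the data stream with a sliding window, then reduces the dimensionality of the high-dimensional data set with an attribute reduction algorithm. Next, a K-means clustering algorithm with a distance-based information entropy filtering mechanism divides the data set into micro-clusters, and outliers are detected within the micro-clusters. Experimental results show that the algorithm effectively improves both the efficiency and the accuracy of outlier detection in high-dimensional data streams.

Keywords: high-dimensional data stream; sliding window; attribute reduction; K-means; micro-clustering; information entropy; outlier detection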

The Study on Clustering-Based Outlier Detection Algorithm for High-Dimensional Data Stream
CHENG Yan, MIAO Yong-chun. The Study on Clustering-Based Outlier Detection Algorithm for High-Dimensional Data Stream[J]. Journal of Jiangxi Normal University (Natural Sciences Edition), 2014, 0(5): 449-453.
Authors: CHENG Yan, MIAO Yong-chun
Affiliation: College of Computer Information and Engineering, Jiangxi Normal University
Abstract: Existing clustering-based outlier detection suffers from low efficiency and precision when dealing with high-dimensional data streams. To relieve this problem, a clustering-based outlier detection algorithm for high-dimensional data streams (CODHD-Stream) is presented. The algorithm uses a sliding window to divide the data stream, then reduces the dimensionality of the high-dimensional data with an attribute reduction algorithm. Finally, it divides the data set into a number of micro-clusters using K-means with a distance-based information entropy mechanism, and detects the outliers contained in the micro-clusters. The experimental analyses show that the proposed algorithm can effectively raise the speed and accuracy of outlier detection in high-dimensional data streams.
Keywords: high-dimensional data stream; sliding window; attribute reduction; K-means; micro-clustering; information entropy; outlier detection
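
The pipeline outlined in the abstract (sliding window, attribute reduction, K-means micro-clusters, outlier detection) can be sketched roughly as follows. This is an illustrative sketch only, not the CODHD-Stream algorithm itself: the abstract does not specify the attribute reduction algorithm or the distance-based information entropy filter, so the sketch substitutes a simple variance ranking for attribute reduction and a size-plus-radius rule for the outlier test. All function names and parameters (`keep`, `k`, `min_size`, `factor`) are hypothetical.

```python
import math
from collections import deque


def sliding_windows(stream, size, step):
    """Yield a window of the last `size` points every `step` arrivals."""
    window = deque(maxlen=size)
    for i, point in enumerate(stream, start=1):
        window.append(point)
        if i >= size and (i - size) % step == 0:
            yield list(window)


def top_variance_attributes(window, keep):
    """Stand-in for attribute reduction (assumption): keep the `keep`
    attributes with the largest variance inside the window."""
    def variance(j):
        col = [p[j] for p in window]
        mean = sum(col) / len(col)
        return sum((v - mean) ** 2 for v in col) / len(col)

    dims = range(len(window[0]))
    return sorted(sorted(dims, key=variance, reverse=True)[:keep])


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]


def kmeans(points, k, iters=10):
    """Plain Lloyd iterations with deterministic farthest-first seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = assign(points, centers)
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # leave an empty cluster's center where it is
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return centers, assign(points, centers)


def detect_outliers(window, keep=2, k=2, min_size=1, factor=3.0):
    """Return indices (within the window) of suspected outliers.

    Simplified rule (assumption, not the paper's entropy filter): a point is
    flagged if its micro-cluster has at most `min_size` members, or if it
    lies farther than `factor` times the cluster's mean radius.
    """
    attrs = top_variance_attributes(window, keep)
    reduced = [tuple(p[j] for j in attrs) for p in window]
    centers, labels = kmeans(reduced, k)
    sizes = [labels.count(c) for c in range(k)]
    dists = [math.dist(p, centers[l]) for p, l in zip(reduced, labels)]
    radius = [sum(d for d, l in zip(dists, labels) if l == c) / max(sizes[c], 1)
              for c in range(k)]
    return [i for i, (d, l) in enumerate(zip(dists, labels))
            if sizes[l] <= min_size or d > factor * radius[l]]
```

To try it, feed each window produced by `sliding_windows(stream, size, step)` to `detect_outliers`; the returned indices are positions inside that window. Farthest-first seeding is chosen here only to keep the sketch deterministic, and it tends to isolate extreme points into their own micro-cluster, which the size rule then flags.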
This article is indexed in CNKI, Wanfang Data, and other databases.