
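Outlier Detection Algorithm Based on Neighborhood and Density
Cite this article: Tao Yunxin, Pi Dechang. Outlier detection algorithm based on neighborhood and density [J]. Journal of Jilin University (Information Science Edition), 2008, 26(4).
Authors: Tao Yunxin, Pi Dechang
Affiliation: College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016
Funding: Supported by the National High Technology Research and Development Program of China (863 Program)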
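Abstract: To reduce the number of neighborhood queries performed by density-based outlier detection algorithms, and to avoid both the loss of meaningful outliers in ODBSN (Outlier Detection Based on Square Neighborhood) and the wrong outlier judgments that arise when objects in a sparse cluster lie close to a dense cluster, an outlier detection algorithm based on neighborhood and density, NDOD (Neighborhood and Density based Outlier Detection), is proposed. Borrowing the idea of grid-based methods, NDOD expands square neighborhoods in breadth-first order, which cuts the number of neighborhood queries severalfold, quickly excludes cluster points, and overcomes the "curse of dimensionality" of grid-based methods. A newly introduced neighborhood-based local outlier factor represents the degree to which a candidate outlier is an outlier and is used to refine the candidate set; it avoids the shortcomings of ODBSN and uncovers more meaningful outliers. Tests on large-scale, arbitrarily shaped two-dimensional spatial data show that the algorithm is feasible and effective.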

Keywords: data mining; outlier; square neighborhood; density; local outlier factor

Outlier Detection Algorithm Based on Neighborhood and Density
TAO Yun-xin, PI De-chang. Outlier Detection Algorithm Based on Neighborhood and Density [J]. Journal of Jilin University: Information Sci Ed, 2008, 26(4).
Authors: TAO Yun-xin, PI De-chang
Abstract: ODBSN (Outlier Detection Based on Square Neighborhood) may lose meaningful outliers and can misjudge outliers when objects from a sparse cluster lie close to a denser cluster. To avoid these shortcomings and to reduce the number of neighborhood queries required by representative density-based algorithms, a new neighborhood and density based outlier detection algorithm named NDOD (Neighborhood and Density based Outlier Detection) is proposed. Borrowing from grid-based methods, NDOD expands square neighborhoods by breadth-first search, which drastically reduces the number of neighborhood queries, eliminates cluster points quickly, and overcomes the "curse of dimensionality" of grid-based methods. A novel neighborhood-based local outlier factor is defined for candidate outliers, so each discovered outlier carries a degree of being an outlier and is more meaningful. Extensive experiments on large-scale data sets of different shapes demonstrate that the algorithm is effective and feasible.
Keywords: data mining; outlier; square neighborhood; density; local outlier factor
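
The breadth-first square-neighborhood expansion mentioned in the abstract can be illustrated with a rough sketch. This is not the authors' NDOD implementation: the abstract gives no pseudocode, so the function names, the parameters `eps` and `min_pts`, the brute-force neighborhood query, and the rule of re-querying only points in the outer part of a dense square are all assumptions made for illustration. The neighborhood-based local outlier factor is not sketched, because the abstract does not give its formula.

```python
from collections import deque


def chebyshev(p, q):
    """Chebyshev (max-coordinate) distance, the natural metric for squares."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def square_neighbors(points, i, eps):
    """Indices of points inside the axis-aligned square of half-side eps
    centered on points[i] (brute force; includes i itself)."""
    return [j for j, q in enumerate(points) if chebyshev(points[i], q) <= eps]


def candidate_outliers(points, eps, min_pts):
    """Return (candidate indices, number of neighborhood queries).

    Assumed rule: a square holding at least min_pts points is dense, and
    every point in it is treated as a cluster point without being queried.
    Only newly covered points in the outer part of the square (farther than
    eps / 2 from its center) are queued for further expansion.
    """
    covered = [False] * len(points)  # known to lie inside some dense square
    queried = [False] * len(points)  # has had its own neighborhood queried
    n_queries = 0
    for seed in range(len(points)):
        if covered[seed] or queried[seed]:
            continue
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            if queried[i]:
                continue
            queried[i] = True
            nbrs = square_neighbors(points, i, eps)
            n_queries += 1
            if len(nbrs) < min_pts:
                continue  # sparse square: nothing is marked as clustered
            for j in nbrs:
                newly_covered = not covered[j]
                covered[j] = True
                far_from_center = chebyshev(points[i], points[j]) > eps / 2
                if newly_covered and not queried[j] and far_from_center:
                    queue.append(j)
    candidates = [i for i, c in enumerate(covered) if not c]
    return candidates, n_queries


# A 9 x 9 grid with spacing 0.5 plus one isolated point at index 81.
pts = [(x * 0.5, y * 0.5) for x in range(9) for y in range(9)] + [(20.0, 20.0)]
cands, n_queries = candidate_outliers(pts, eps=1.5, min_pts=4)
print(cands, n_queries, len(pts))
```

In this toy run the isolated point is the only candidate left, and fewer neighborhood queries are issued than there are points, because points near the center of a dense square are absorbed without being queried themselves.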
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.