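Density peaks clustering algorithm combining relative density with nearest neighbor relationship
Cite this article: DAI Yongyang, ZHANG Qinghua, ZHI Xuechao. Density peaks clustering algorithm combining relative density with nearest neighbor relationship[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2021, 33(5): 791-805. DOI: 10.3979/j.issn.1673-825X.202105220174
Authors: DAI Yongyang  ZHANG Qinghua  ZHI Xuechao
Affiliation: Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065
Funding: National Key Research and Development Program of China (2020YFC2003502); National Natural Science Foundation of China (61876201)
Abstract: Density peaks clustering (DPC) is a density-based clustering algorithm that can find clusters of arbitrary shape. On datasets whose clusters differ in density, DPC cannot select cluster centers accurately. In addition, DPC's allocation strategy for non-center points can cause a chain of consecutive errors that degrades the clustering result. Fuzzy k-nearest neighbor DPC (FKNN-DPC) is an improved DPC algorithm that combines boundary point detection with a two-step allocation strategy to avoid such consecutive errors. However, when clusters differ in density, the boundary point detection of FKNN-DPC performs poorly, and its allocation strategy for non-center points does not take the nearest neighbor information of the samples into account. To solve these problems, the RN-DPC algorithm is proposed by defining a relative density and combining it with nearest neighbor relationships. To address DPC's inability to select cluster centers accurately when clusters differ in density, the relative density is defined to eliminate the density gap among clusters. Boundary points are detected on the basis of the reverse k-nearest neighbor relationship, and the shared nearest neighbor relationship is introduced to improve the allocation strategy of FKNN-DPC. RN-DPC is compared with different clustering algorithms on synthetic and real-world datasets, and the experimental results verify its effectiveness and rationality.

Keywords: clustering  density peaks  nearest neighbor relationship  boundary point detection  nearest neighbor assignment
Received: 2021-05-22
Revised: 2021-06-27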
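
For orientation, the baseline DPC procedure that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of standard DPC with a cutoff-kernel density, not the RN-DPC method: the abstract does not give the relative density formula, so it is not reproduced here. The function name `dpc` and the parameters `d_c` (cutoff distance) and `n_centers` are assumptions made for this example.

```python
import numpy as np

def dpc(X, d_c, n_centers):
    """Baseline DPC sketch (hypothetical helper, cutoff-kernel density)."""
    n = len(X)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Local density rho: number of other points closer than d_c.
    rho = (dist < d_c).sum(axis=1) - 1
    # Visit points from densest to sparsest; ties are broken by index.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    parent = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest point: delta is its largest distance to any point.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]  # points ranked as denser than i
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]  # distance to the nearest denser point
        parent[i] = j
    # Centers are the points with the largest gamma = rho * delta.
    # The densest point always attains the largest gamma, so it is a center.
    gamma = rho * delta
    centers = np.argsort(-gamma, kind="stable")[:n_centers]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    # Each non-center point inherits the label of its nearest denser point.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers, rho, delta

# Two small, well-separated 3x3 grids as toy data.
grid = np.array([[0.2 * i, 0.2 * j] for i in (-1, 0, 1) for j in (-1, 0, 1)])
X = np.vstack([grid, grid + 5.0])
labels, centers, rho, delta = dpc(X, d_c=0.3, n_centers=2)
print(labels, centers)
```

Because every non-center point simply copies the label of its nearest denser point, one wrong assignment is inherited by all points that depend on it. This is the chain of consecutive errors the abstract refers to. Raw `rho` also favors dense clusters when centers are ranked, which is the density-gap problem the relative density is meant to remove.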

Density peaks clustering algorithm by combining relative density with nearest neighbor relationship
DAI Yongyang,ZHANG Qinghua,ZHI Xuechao. Density peaks clustering algorithm by combining relative density with nearest neighbor relationship[J]. Journal of Chongqing University of Posts and Telecommunications, 2021, 33(5): 791-805. DOI: 10.3979/j.issn.1673-825X.202105220174
Authors:DAI Yongyang  ZHANG Qinghua  ZHI Xuechao
Affiliation:Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
Abstract: Density peaks clustering (DPC) is a density-based clustering algorithm that can find clusters of arbitrary shape. However, DPC cannot select cluster centers accurately when there is a density gap among clusters. Moreover, the allocation strategy that DPC uses for non-center points can cause a chain of consecutive errors, which degrades clustering performance. Fuzzy k-nearest neighbor DPC (FKNN-DPC) avoids such errors by combining boundary detection with a two-step allocation strategy for non-center points. However, the boundary detection method of FKNN-DPC cannot handle clusters with a density gap, and its allocation strategy for non-center points does not take the nearest neighbor information of data points into account. To address these issues, an improved DPC algorithm based on relative density and nearest neighbor relationship (RN-DPC) is proposed. First, a relative density is defined to eliminate the density gap among clusters, so that correct cluster centers can be selected even when clusters differ in density. Then, the reverse k-nearest neighbor relationship is used to detect boundary points, and the shared nearest neighbor relationship is introduced to improve the allocation strategy for non-center points in FKNN-DPC. Finally, the proposed algorithm is compared with other clustering algorithms on synthetic and real-world datasets. The experimental results demonstrate the effectiveness and rationality of the proposed algorithm.
Keywords: clustering  density peaks  nearest neighbor relationship  boundary detection  nearest neighbor assignment
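
The two neighbor relationships named in the abstract can be illustrated with their generic textbook definitions. The sketch below assumes Euclidean distance and a fixed `k`; the helper names are hypothetical, and the way RN-DPC turns these quantities into a boundary test or an allocation rule is not stated in the abstract, so it is not shown.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbors of every point (self excluded)."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return np.argsort(dist, axis=1, kind="stable")[:, :k]

def reverse_knn_counts(knn):
    """For each point, how many other points list it among their k-NN."""
    return np.bincount(knn.ravel(), minlength=knn.shape[0])

def shared_neighbor_count(knn, i, j):
    """Number of k-nearest neighbors that points i and j have in common."""
    return len(set(knn[i]) & set(knn[j]))

# Three nearby points on a line plus one distant point.
X = np.array([[0.0], [1.0], [2.0], [10.0]])
knn = knn_indices(X, k=2)
print(reverse_knn_counts(knn))           # [2 3 3 0]
print(shared_neighbor_count(knn, 0, 1))  # 1
```

In this toy data the distant point has no reverse neighbors, since no other point counts it among its two nearest. That illustrates why reverse k-nearest neighbor counts are a plausible signal for points on the edge of a cluster. The count is also independent of the absolute density scale, unlike a fixed distance cutoff.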
This article is indexed in Wanfang Data and other databases.