MCS: A Method for Finding the Number of Clusters
Authors: Ahmed N. Albatineh, Magdalena Niewiadomska-Bugaj
Affiliation: (1) Department of Biostatistics, Bioinformatics, and Epidemiology, Medical University of South Carolina, Charleston, SC, USA; (2) Department of Biochemistry and Molecular Biology, Medical University of South Carolina, Charleston, SC, USA; (3) Marine Biomedicine and Environmental Science Center, Medical University of South Carolina, Charleston, SC, USA; (4) South Carolina Department of Natural Resources, Marine Resources Research Institute, Charleston, SC, USA
Abstract: This paper proposes a maximum clustering similarity (MCS) method for determining the number of clusters in a data set by studying the behavior of similarity indices that compare two (of several) clustering methods. The similarity between the two clusterings is calculated at the same number of clusters, using the indices of Rand (R), Fowlkes and Mallows (FM), and Kulczynski (K), each corrected for chance agreement. The number of clusters at which the index attains its maximum is a candidate for the optimal number of clusters. The proposed method is applied to simulated bivariate normal data and is further extended to circular data. Its performance is compared to the criteria discussed in Tibshirani, Walther, and Hastie (2001). The proposed method does not rely on any distributional or data assumption, which makes it widely applicable to any type of data that can be clustered using at least two clustering algorithms.
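The selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the two clusterings have already been computed for each candidate number of clusters and are passed in as dictionaries mapping k to a list of labels. The function names `adjusted_rand` and `mcs_candidate` are hypothetical, and the chance correction shown is the commonly used Hubert-Arabie form of the Rand index, which may differ from the exact correction used in the paper. The FM and K indices are not shown.

```python
from collections import Counter
from math import comb


def adjusted_rand(labels_a, labels_b):
    """Chance-corrected Rand index (Hubert-Arabie form; an assumption here)."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two labelings of the same length >= 2")
    # Count pairs of points placed together in both, in A only, in B only.
    pairs_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = pairs_a * pairs_b / comb(n, 2)  # agreement expected by chance
    max_index = (pairs_a + pairs_b) / 2
    if max_index == expected:
        # Degenerate case (e.g. one cluster in both labelings), so k
        # should start at 2 in practice.
        return 1.0
    return (pairs_both - expected) / (max_index - expected)


def mcs_candidate(labelings_a, labelings_b, index=adjusted_rand):
    """Return (k, scores): the k where the similarity index is largest.

    labelings_a, labelings_b: dicts mapping k -> labels produced by two
    different clustering methods on the same data (hypothetical layout).
    Ties are resolved in favor of the smallest k.
    """
    ks = sorted(set(labelings_a) & set(labelings_b))
    scores = {k: index(labelings_a[k], labelings_b[k]) for k in ks}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Toy usage with hand-written labelings for six points.
method_a = {2: [0, 0, 0, 1, 1, 1], 3: [0, 0, 1, 1, 2, 2]}
method_b = {2: [1, 1, 1, 0, 0, 0], 3: [0, 1, 1, 2, 2, 0]}
best_k, scores = mcs_candidate(method_a, method_b)
```

In this toy input the two methods agree perfectly at k = 2 (up to relabeling) and disagree at k = 3, so k = 2 is returned as the candidate. Passing a different function as `index` would allow other chance-corrected indices to be swapped in.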
This article is indexed in SpringerLink and other databases.