Similarity measurement method of high-dimensional data based on normalized net lattice subspace |
| |
Authors: | Li Wenfa, Wang Gongming, Li Ke, Huang Su |
| |
Affiliation: | 1. Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, P.R. China; 2. National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, P.R. China |
| |
Abstract: | The performance of conventional similarity measurement methods is seriously degraded by the curse of dimensionality in high-dimensional data. The reason is that differences in sparse and noisy dimensions account for a large proportion of the similarity, which makes any two results appear dissimilar. A similarity measurement method for high-dimensional data based on a normalized net lattice subspace is proposed. The data range of each dimension is divided into several intervals, and the components in different dimensions are mapped onto the corresponding intervals. Only components in the same or adjacent intervals are used to calculate the similarity. To validate this method, three data types are used, and seven common similarity measurement methods are compared. The experimental results indicate that the relative difference of the method increases with the dimensionality and is approximately two or three orders of magnitude higher than that of the conventional methods. In addition, the similarity range of this method in different dimensions is [0, 1], which makes it suitable for similarity analysis after dimensionality reduction. |
| |
Keywords: | high-dimensional data; the curse of dimensionality; similarity; normalization; subspace; NPsim |
This article has been indexed by CNKI, Wanfang Data, and other databases. |
|
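The abstract describes the procedure only in outline: partition each dimension's range into intervals, map each component to its interval, and let only components in the same or adjacent intervals contribute to the similarity. The following is a minimal Python sketch of that idea, not the authors' exact formula. The function names `interval_index` and `lattice_similarity`, the equal-width partition, the per-dimension score `1 - |x - y| / range`, and the division by the total number of dimensions are all assumptions made for illustration.

```python
from typing import Sequence


def interval_index(value: float, low: float, high: float, n_intervals: int) -> int:
    """Map a component to the index of the equal-width interval that contains it.

    Equal-width intervals are an assumption; the abstract only says the range
    is "divided into several intervals".
    """
    if high <= low:
        return 0  # degenerate dimension: every value falls in one interval
    pos = (value - low) / (high - low) * n_intervals
    # Clamp so that out-of-range values and value == high stay in a valid interval.
    return min(max(int(pos), 0), n_intervals - 1)


def lattice_similarity(
    x: Sequence[float],
    y: Sequence[float],
    lows: Sequence[float],
    highs: Sequence[float],
    n_intervals: int = 10,
) -> float:
    """Illustrative similarity in [0, 1] over a lattice of per-dimension intervals.

    Only dimensions whose components fall in the same or adjacent intervals
    contribute. The per-dimension score and the normalization by the number of
    dimensions are hypothetical choices, not taken from the paper.
    """
    d = len(x)
    if d == 0:
        raise ValueError("vectors must have at least one dimension")
    total = 0.0
    for xi, yi, lo, hi in zip(x, y, lows, highs):
        ix = interval_index(xi, lo, hi, n_intervals)
        iy = interval_index(yi, lo, hi, n_intervals)
        if abs(ix - iy) > 1:
            continue  # components are far apart: this dimension is ignored
        if hi > lo:
            # Normalized closeness, clamped so the score never goes negative.
            total += max(0.0, 1.0 - abs(xi - yi) / (hi - lo))
        else:
            total += 1.0  # constant dimension: components are identical
    return total / d


if __name__ == "__main__":
    x = [0.10, 0.50, 0.95]
    y = [0.15, 0.52, 0.05]
    lows, highs = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
    # The third dimension is skipped because its components lie in distant intervals.
    print(lattice_similarity(x, y, lows, highs))
```

Because each contributing dimension adds at most 1 and the sum is divided by the dimension count, the result of this sketch stays in [0, 1] regardless of dimensionality, which mirrors the range property stated in the abstract.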