Similar documents
1.
One key point in cluster analysis is to determine a similarity or dissimilarity measure between data objects. When working with time series, the concept of similarity can be established in different ways. In this paper, several non-parametric statistics originally designed to test the equality of the log-spectra of two stochastic processes are proposed as dissimilarity measures between time series data. Their behavior in time series clustering is analyzed through a simulation study and compared with the performance of several model-free and model-based dissimilarity measures. Three classification settings were considered: (i) distinguishing between stationary and non-stationary time series, (ii) classifying different ARMA processes, and (iii) classifying several non-linear time series models. As expected, the performance of a particular dissimilarity metric depended strongly on the type of processes subjected to clustering. Among all the measures studied, the non-parametric distances showed the most robust behavior.

2.
Two fundamental approaches to the comparison of classifications (e.g., partitions on the same finite set of objects) can be distinguished. One approach is based upon measures of metric dissimilarity, while the other is based upon measures of similarity, or consensus. These approaches are not necessarily simple complements of each other. Instead, each captures a different, limited view of the comparison of two classifications. The properties of these measures are clarified by their relationships to Day's complexity models and to association measures of numerical taxonomy. The two approaches to comparison are equated with the use of separation-sensitive and minimum-value-sensitive measures, suggesting the potential application of an intermediate-sensitive measure to the problem of comparison of classifications. Such a measure is a linear combination of separation-sensitive and minimum-value-sensitive components. The application of these intermediate measures is contrasted with the two extremes. The intermediate measure for the comparison of classifications is applied to a problem of character weighting arising in the analysis of Australian stream basins. We thank Bill Day, Mike Austin, Peter Minchin and two anonymous referees for many helpful comments. We also thank P. Arabie for useful discussion of consensus methods and character weighting.

3.
Free-sorting data are obtained when subjects are given a set of objects and are asked to divide them into subsets. Such data are usually reduced by counting, for each pair of objects, how many subjects placed both of them into the same subset. The present study examines the utility of a group of additional statistics: the cooccurrences of sets of three objects. Because there are dependencies among the pair and triple cooccurrences, adjusted triple similarity statistics are developed. Multidimensional scaling and cluster analysis, which usually use pair similarities as their input data, can be modified to operate on three-way similarities to create representations of the set of objects. Such methods are applied to a set of empirical sorting data: Rosenberg and Kim's (1975) fifteen kinship terms. The author thanks Phipps Arabie, Lawrence Hubert, Lawrence Jones, Ed Shoben, and Stanley Wasserman for their considerable contributions to this paper.
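
As a rough illustration of the pair and triple counts described in this abstract, the sketch below tallies cooccurrences from sorting data. The data layout (one list of subsets per subject), the function name `cooccurrence_counts`, and the sample sortings are assumptions made for illustration only; the adjusted triple statistics developed in the paper are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sortings, size):
    """Count, for every group of `size` objects, how many subjects
    placed all of them in the same subset.

    `sortings` is a hypothetical layout: one entry per subject,
    each entry a list of subsets (iterables of object labels).
    """
    counts = Counter()
    for subject in sortings:
        for subset in subject:
            # Sorting the labels gives each group a canonical key.
            for group in combinations(sorted(subset), size):
                counts[group] += 1
    return counts


# Three made-up subjects sorting four kinship terms.
sortings = [
    [{"aunt", "uncle", "cousin"}, {"mother"}],
    [{"aunt", "uncle"}, {"cousin", "mother"}],
    [{"aunt", "uncle", "cousin", "mother"}],
]

pairs = cooccurrence_counts(sortings, 2)    # usual pairwise reduction
triples = cooccurrence_counts(sortings, 3)  # the additional triple counts
```

Calling the same function with `size=2` and `size=3` keeps the pair and triple statistics directly comparable, since both are counted over the same subjects.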

4.
A procedure is presented which permits the analysis of factor analytic problems in which several groups exist. The analysis incorporates a hierarchical scheme of searching for factorial invariance and is an extension of Meredith's (1964) Method One procedure. By overlaying a contextual frame of reference on a traditional factor analysis solution, it is possible to use this technique to examine structural similarity and dissimilarity between groups. The procedure is exhibited in an example and in addition a comparison is made to discriminant analysis.

5.
Many similarity coefficients for binary data are defined as fractions. For certain resemblance measures the denominator may become zero. If the denominator is zero the value of the coefficient is indeterminate. It is shown that the seriousness of the indeterminacy problem differs with the resemblance measures. Following Batagelj and Bren (1995) we remove the indeterminacies by defining appropriate values in critical cases. The author would like to thank three anonymous reviewers for their helpful comments and valuable suggestions on earlier versions of this article.
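
To make the zero-denominator case concrete, here is a minimal sketch using the Jaccard coefficient a / (a + b + c) as one familiar example of a fraction-type coefficient. The function name `jaccard` and the parameter `value_if_undefined` are hypothetical, and the default of 1.0 is an assumption chosen for illustration; the abstract does not state which values the paper assigns in the critical cases.

```python
def jaccard(x, y, value_if_undefined=1.0):
    """Jaccard coefficient a / (a + b + c) for two binary vectors.

    a = attributes present in both vectors; b + c = attributes present
    in exactly one of them. When both vectors are all zeros the
    denominator is 0; `value_if_undefined` (an assumed default, not
    taken from the paper) is returned instead of dividing by zero.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    a = sum(1 for xi, yi in zip(x, y) if xi and yi)
    b_plus_c = sum(1 for xi, yi in zip(x, y) if bool(xi) != bool(yi))
    denominator = a + b_plus_c
    if denominator == 0:
        return value_if_undefined
    return a / denominator


regular = jaccard([1, 1, 0, 0], [1, 0, 1, 0])  # a = 1, b + c = 2
critical = jaccard([0, 0, 0], [0, 0, 0])       # denominator is zero
```

Exposing the fallback as a parameter keeps the choice of value visible to the caller rather than burying it in the function.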

6.
We introduce new similarity measures between two subjects, with reference to variables with multiple categories. In contrast to traditionally used similarity indices, they also take into account the frequency of the categories of each attribute in the sample. This feature is useful when dealing with rare categories, since it makes sense to evaluate the pairwise presence of a rare category differently from the pairwise presence of a widespread one. A weighting criterion for each category derived from Shannon's information theory is suggested. There are two versions of the weighted index: one for independent categorical variables and one for dependent variables. The suitability of the proposed indices is shown in this paper using both simulated and real world data sets.

7.
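In the West, experts continue to debate which term should be used for the concept of "autonomous learning". Taking a historical perspective, this article examines the different English terms for "autonomous learning" and their meanings, analyzes the relationships among these terms, compares the differing connotations of "autonomous learning" in Chinese and Western cultures, and discusses the significance of understanding the concept correctly.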

9.
k-Adic formulations (for groups of objects of size k) of a variety of 2-adic similarity coefficients (for pairs of objects) for binary (presence/absence) data are presented. The formulations are not functions of 2-adic similarity coefficients. Instead, the main objective of the paper is to present k-adic formulations that reflect certain basic characteristics of, and have an interpretation similar to, their 2-adic versions. Two major classes are distinguished. The first class is referred to as Bennani-Heiser similarity coefficients, which contains all coefficients that can be defined using just the matches, the number of attributes that are present and that are absent in k objects, and the total number of attributes. The coefficients in the second class can be formulated as functions of Dice's association indices. The author thanks Willem Heiser and three anonymous reviewers for their helpful comments and valuable suggestions on earlier versions of this article.

10.
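An evaluation paradigm for organizational performance under informatization   Total citations: 4 (self-citations: 0; citations by others: 4)
As the foundation and an important component of national and social informatization, enterprise informatization has become a strategic measure for raising the level of enterprise management and strengthening enterprise competitiveness. Evaluating the benefits and impact of information technology applications, however, has long been regarded as a thorny problem. Starting from the coupling of enterprise performance transformation and informatization, this paper critically examines the causes of the "information paradox" and the shortcomings of output-based and behavior-based theories of informatization performance evaluation. Through an analysis of the mechanism by which informatization performance arises and of its characteristics, it constructs an evaluation paradigm for informatization performance guided by process analysis.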

11.
The problem of measuring the impact of individual data points in a cluster analysis is examined. The purpose is to identify those data points that have an influence on the resulting cluster partitions. Influence of a single data point is considered present when different cluster partitions result from the removal of the element from the data set. The Hubert and Arabie (1985) corrected Rand index was used to provide numerical measures of influence of a data point. Simulated data sets consisting of a variety of cluster structures and error conditions were generated to validate the influence measures. The results showed that the measure of internal influence was 100% accurate in identifying those data elements exhibiting an influential effect. The nature of the influence, whether beneficial or detrimental to the clustering, can be evaluated with the use of the gamma and point-biserial statistics.

12.
Analysis of between-group differences using canonical variates assumes equality of population covariance matrices. Sometimes these matrices are sufficiently different for the null hypothesis of equality to be rejected, but there exist some common features which should be exploited in any analysis. The common principal component model is often suitable in such circumstances, and this model is shown to be appropriate in a practical example. Two methods for between-group analysis are proposed when this model replaces the equal dispersion matrix assumption. One method is by extension of the two-stage approach to canonical variate analysis using sequential principal component analyses as described by Campbell and Atchley (1981). The second method is by definition of a distance function between populations satisfying the common principal component model, followed by metric scaling of the resulting between-populations distance matrix. The two methods are compared with each other and with ordinary canonical variate analysis on the previously introduced data set.

13.
Similarity measures are entities that can be used to quantify the similarity between two vectors of real numbers. We present inequalities between seven well-known similarities. The inequalities are valid if the vectors contain non-negative real numbers.
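
The abstract does not name its seven similarities, so the sketch below uses three common ones (Tanimoto, Dice, and cosine) purely as assumed examples of the kind of inequality involved. For non-zero vectors with non-negative entries, Tanimoto ≤ Dice ≤ cosine: Dice ≤ cosine follows from |x|² + |y|² ≥ 2|x||y|, and Tanimoto equals Dice / (2 − Dice), which cannot exceed Dice when Dice lies in [0, 1]. The function names and sample vectors are hypothetical.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


def tanimoto(x, y):
    """Tanimoto similarity: x.y / (|x|^2 + |y|^2 - x.y)."""
    xy = dot(x, y)
    return xy / (dot(x, x) + dot(y, y) - xy)


def dice(x, y):
    """Dice similarity: 2 x.y / (|x|^2 + |y|^2)."""
    return 2 * dot(x, y) / (dot(x, x) + dot(y, y))


def cosine(x, y):
    """Cosine similarity: x.y / (|x| |y|)."""
    return dot(x, y) / math.sqrt(dot(x, x) * dot(y, y))


# Made-up non-negative, non-zero vectors.
x = [1.0, 2.0, 0.0, 3.0]
y = [2.0, 0.0, 1.0, 1.0]
t, d, c = tanimoto(x, y), dice(x, y), cosine(x, y)
ordered = t <= d <= c
```

With negative entries the inner product can be negative and the ordering need not hold, which is consistent with the non-negativity condition stated in the abstract.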

14.
Analytic procedures for classifying objects are commonly based on the product-moment correlation as a measure of object similarity. This statistic, however, generally does not represent an invariant index of similarity between two objects if they are measured along different bipolar variables where the direction of measurement for each variable is arbitrary. A computer simulation study compared Cohen's (1969) proposed solution to the problem, the invariant similarity coefficient r_c, with the mean product-moment correlation based on all possible changes in the measurement direction of individual variables within a profile of scores. The empirical observation that r_c approaches the mean product-moment correlation with increases in the number of scores in the profiles was interpreted as encouragement for the use of r_c in classification research. Some cautions regarding its application were noted. This research was supported by the Social Sciences and Humanities Research Council of Canada, Grant no. 410-83-0633, and by the University of Toronto.

15.
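This paper examines the influence of Popper's and Kuhn's philosophies of science within the philosophy of science community, the scientific community, and the humanities and social sciences. Its situational analysis explains their influence in terms of the characteristics of Popper's and Kuhn's works, their social connections, the social context, and personality factors. The examination in effect touches on socially constructed factors in the history of ideas, factors that should operate broadly in the history of every discipline.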

16.
Interpreting a taxonomic tree as a set of objects leads to natural measures of complexity and similarity, and sets natural lower bounds on a consensus tree. Interpretations differing as to the kind of objects constituting a tree lead to different measures and consensus. Subset nesting is preferred over the clusters (strict consensus) and even the triads interpretations because of its superior expression of shared structure. Algorithms for computing the complexity and similarity of trees, as well as a consensus index onto [0,1], are presented for this interpretation. The full consensus is defined as the only tree which includes all the nestings shared in a profile of rival trees and whose clusters reflect only nestings shared in the profile. The full consensus is proved to exist uniquely for each profile, and to equal the Adams consensus. The author is grateful for the many helpful comments on presentation from Frances McA. Adams, William H. E. Day, and Christopher A. Meacham.

17.
Non-symmetrical correspondence analysis (NSCA) is a very practical statistical technique for identifying the structure of association between asymmetrically related categorical variables forming a contingency table. This paper considers some tools that can be used to explore the association between these variables in detail, both numerically and graphically. These include the use of confidence regions, the establishment of the link between NSCA and the analysis of variance of categorical variables, and the effect of imposing linear constraints on a variable. The authors would like to thank the anonymous referees for their comments and suggestions during the preparation of this paper.

18.
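Three specialized terms that describe the relatedness of cDNA, DNA, amino acid, and protein sequences (homology, identity, and similarity) are often used interchangeably in biology papers. This article analyzes their specific meanings and usage in order to clarify how each term should be used in papers and to improve the quality of editing and proofreading.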

20.
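From class analysis to productive-forces analysis   Total citations: 5 (self-citations: 0; citations by others: 5)
This article recommends a new mode of thinking: the method of productive-forces analysis. Applying the theory of historical materialism to social, historical, economic, political, and cultural questions requires the method of productive-forces analysis. In the past we habitually used the method of class analysis. In fact, even during the revolutionary war period, the division of classes and their historical position and role were determined by the development of the productive forces. Applying this "productive-forces analysis" mode of thinking can help us recognize and understand certain problems in the current development of Chinese society, analyze and understand the direction of social and economic development in today's developed countries, and explore and understand the problem of corruption. To study the spirit of the documents of the 16th National Congress, one must learn the method of "productive-forces analysis".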
