Similar literature
 Found 20 similar articles (search time: 187 ms)
1.
Ordered set theory provides efficient tools for the problems of comparison and consensus of classifications. Here, an overview of results obtained by the ordinal approach is presented. Latticial or semilatticial structures of the main sets of classification models are described. Many results on partitions are adaptable to dendrograms; many results on n-trees hold in any median semilattice and thus have counterparts on ordered trees and Buneman (phylogenetic) trees. For the comparison of classifications, the semimodularity of the ordinal structures involved yields computable least-move metrics based on weighted or unweighted elementary transformations. In the unweighted case, these metrics have simple characteristic properties. For the consensus of classifications, the constructive, axiomatic, and optimization approaches are considered. Natural consensus rules (majoritary, oligarchic, ...) have adequate ordinal formalizations. A unified presentation of Arrow-like characterization results is given. In the cases of n-trees, ordered trees and Buneman trees, the majority rule is a significant example where the three approaches converge. The authors would like to thank the anonymous referees for helpful suggestions on the first draft of this paper, and W. H. E. Day for his comments and his significant improvements of style.

2.
Consensus conferences are social techniques which involve bringing together a group of scientific experts, and sometimes also non-experts, in order to increase the public role in science and related policy, to amalgamate diverse and often contradictory evidence for a hypothesis of interest, and to achieve scientific consensus or at least the appearance of consensus among scientists. For consensus conferences that set out to amalgamate evidence, I propose three desiderata: Inclusivity (the consideration of all available evidence), Constraint (the achievement of some agreement of intersubjective assessments of the hypothesis of interest), and Evidential Complexity (the evaluation of available evidence based on a plurality of relevant evidential criteria). Two examples suggest that consensus conferences can readily satisfy Inclusivity and Evidential Complexity, but consensus conferences do not as easily satisfy Constraint. I end by discussing the relation between social inclusivity and the three desiderata.

3.
We examine the problem of aggregating several partitions of a finite set into a single consensus partition. We note that the dual concepts of clustering and isolation are especially significant in this connection. The hypothesis that a consensus partition should respect unanimity with respect to either concept leads us to stress a consensus interval rather than a single partition. The extremes of this interval are characterized axiomatically. If a sufficient totality of traits has been measured, and if measurement errors are independent, then a true classifying partition can be expected to lie in the consensus interval. The structure of the partitions in the interval lends itself to partial solutions of the consensus problem. Conditional entropy may be used to quantify the uncertainty inherent in the interval as a whole.
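The abstract does not say how conditional entropy is applied to the consensus interval, so the following is only a minimal sketch of the standard quantity H(P | Q) for two partitions of the same finite set, measured in bits. The function name `conditional_entropy` and the representation of a partition as a list of sets are assumptions made for illustration.

```python
from math import log2


def conditional_entropy(p, q):
    """Return H(P | Q) in bits for two partitions of the same finite set.

    Each partition is a list of disjoint blocks (sets) covering the same
    elements. This is the textbook definition, not necessarily the exact
    measure used in the paper.
    """
    n = sum(len(block) for block in q)
    h = 0.0
    for q_block in q:
        for p_block in p:
            n_ij = len(set(p_block) & set(q_block))
            if n_ij:
                # Probability of the joint cell times the log of the
                # probability of the P-block inside the Q-block.
                h -= (n_ij / n) * log2(n_ij / len(q_block))
    return h


fine = [{1, 2}, {3, 4}]
coarse = [{1, 2, 3, 4}]
print(conditional_entropy(fine, coarse))   # uncertainty about the fine blocks
print(conditional_entropy(coarse, fine))   # zero: fine determines coarse
```

When one partition refines another, knowing the finer one leaves no uncertainty about the coarser one, which is why the second value is zero.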

4.
The paper presents methodology for analyzing a set of partitions of the same set of objects, by dividing them into classes of partitions that are similar to one another. Two different definitions are given for the consensus partition which summarizes each class of partitions. The classes are obtained using either constrained or unconstrained clustering algorithms. Two applications of the methodology are described.

5.
6.
A class of (multiple) consensus methods for n-trees (dendroids, hierarchical classifications) is studied. This class constitutes an extension of the so-called median consensus in the sense that we get two numbers $\underline{m}$ and $\overline{m}$ such that: If a cluster $X$ occurs in $k$ n-trees of a profile $P$, with $k \ge \overline{m}$, then it occurs in every consensus n-tree of $P$. If $X$ occurs in $k$ n-trees of $P$, with $\underline{m} \le k < \overline{m}$, then it may, or may not, belong to a consensus n-tree of $P$. If $X$ occurs in $k$ n-trees of $P$, with $k < \underline{m}$, then it cannot occur in any consensus n-tree of $P$. If these conditions are satisfied, the multiconsensus function is said to be thresholded by the pair $(\underline{m}, \overline{m})$. Two results are obtained. The first one characterizes the pairs of numbers that can be viewed as thresholds for some consensus function. The second one provides a characterization of thresholded consensus methods. As an application a characterization of the quota rules is provided.
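The three threshold conditions in the abstract above can be illustrated with a short sketch that counts how often each cluster occurs in a profile and sorts it by a lower and an upper threshold. The names `classify_clusters`, `m_low` and `m_high`, and the representation of an n-tree as a set of frozensets (nontrivial clusters only), are assumptions for illustration. The sketch does not check whether a given pair of thresholds is admissible, which is what the paper's first result characterizes.

```python
from collections import Counter


def classify_clusters(profile, m_low, m_high):
    """Sort the clusters of a profile by how many n-trees contain them.

    profile : list of n-trees, each given as a set of frozenset clusters
    m_low   : lower threshold (hypothetical name)
    m_high  : upper threshold (hypothetical name)
    """
    if not 0 <= m_low <= m_high:
        raise ValueError("expected 0 <= m_low <= m_high")
    # k = number of trees of the profile in which each cluster occurs
    counts = Counter(cluster for tree in profile for cluster in set(tree))
    result = {"required": set(), "optional": set(), "forbidden": set()}
    for cluster, k in counts.items():
        if k >= m_high:
            result["required"].add(cluster)    # in every consensus n-tree
        elif k >= m_low:
            result["optional"].add(cluster)    # may or may not occur
        else:
            result["forbidden"].add(cluster)   # in no consensus n-tree
    return result


ab, abc, cd = frozenset("ab"), frozenset("abc"), frozenset("cd")
profile = [{ab, abc}, {ab, cd}, {ab, abc}]
print(classify_clusters(profile, m_low=2, m_high=3))
```

In this toy profile of three trees, `ab` occurs three times, `abc` twice and `cd` once, so with thresholds 2 and 3 they fall into the required, optional and forbidden groups respectively.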

7.
8.
The median procedure for n-trees as a maximum likelihood method   Cited by: 1 (self-citations: 1; citations by others: 0)
A few axioms are presented which allow the median procedure for n-trees to be given a maximum likelihood interpretation. Research supported by grant number N00014-89-J-1643 from the Office of Naval Research. The author would like to thank the referees for their helpful comments.

9.
Two fundamental approaches to the comparison of classifications (e.g., partitions on the same finite set of objects) can be distinguished. One approach is based upon measures of metric dissimilarity while the other is based upon measures of similarity, or consensus. These approaches are not necessarily simple complements of each other. Instead, each captures different, limited views of comparison of two classifications. The properties of these measures are clarified by their relationships to Day's complexity models and to association measures of numerical taxonomy. The two approaches to comparison are equated with the use of separation and minimum value sensitive measures, suggesting the potential application of an intermediate sensitive measure to the problem of comparison of classifications. Such a measure is a linear combination of separation sensitive and minimum value sensitive components. The application of these intermediate measures is contrasted with the two extremes. The intermediate measure for the comparison of classifications is applied to a problem of character weighting arising in the analysis of Australian stream basins. We thank Bill Day, Mike Austin, Peter Minchin and two anonymous referees for many helpful comments. We also thank P. Arabie for useful discussion of consensus methods and character weighting.

10.
Given two dendrograms (rooted tree diagrams) which have some but not all of their base points in common, a supertree is a dendrogram from which each of the original trees can be regarded as samples. The distinction is made between inconsistent and consistent sample trees, defined by whether or not the samples provide contradictory information about the supertree. An algorithm for obtaining the strict consensus supertree of two consistent sample trees is presented, as are procedures for merging two inconsistent sample trees. Some suggestions for future work are made.

11.
Many methods and algorithms to generate random trees of many kinds have been proposed in the literature. No procedure exists however for the generation of dendrograms with randomized fusion levels. Randomized dendrograms can be obtained by randomizing the associated cophenetic matrix. Two algorithms are described. The first one generates completely random dendrograms, i.e., trees with a random topology, random fusion level values, and random assignment of the labels. The second algorithm uses a double-permutation procedure to randomize a given dendrogram; it proceeds by randomization of the fixed fusion levels, instead of using random fusion level values. A proof is presented that the double-permutation procedure is a Uniform Random Generation Algorithm sensu Furnas (1984), and a complete example is given. This work was supported by NSERC Grant No. A7738 to P. Legendre and by a NSERC scholarship to F.-J. Lapointe.

12.
Classifications are generally pictured in the form of hierarchical trees, also called dendrograms. A dendrogram is the graphical representation of an ultrametric (=cophenetic) matrix; so dendrograms can be compared to one another by comparing their cophenetic matrices. Three methods used in testing the correlation between matrices corresponding to dendrograms are evaluated. The three permutational procedures make use of different aspects of the information to compare dendrograms: the Mantel procedure permutes label positions only; the binary tree methods randomize the topology as well; the double-permutation procedure is based on all the information included in a dendrogram, that is: topology, label positions, and cluster heights. Theoretical and empirical investigations of these methods are carried out to evaluate their relative performance. Simulations show that the Mantel test is too conservative when applied to the comparison of dendrograms; the methods of binary tree comparisons do slightly better; only the double-permutation test provides unbiased type I error.
This work was supported by NSERC grant no. A7738 to Pierre Legendre and by a NSERC scholarship to F.-J. Lapointe.
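Of the three procedures compared in the abstract above, the Mantel procedure is the simplest to sketch, since it permutes label positions only. The following is a minimal, standard-library illustration of a one-sided Mantel test on two symmetric matrices, not the authors' implementation. The function names, the default of 999 permutations and the fixed seed are assumptions, and the matrices are assumed to be non-constant so that the correlation is defined.

```python
import random
from math import sqrt


def upper_triangle(matrix, order):
    """Upper-triangle entries of `matrix` after relabelling rows/columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes nonzero variance)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_test(a, b, permutations=999, seed=0):
    """Return (r_observed, one-sided p-value) for symmetric matrices a and b."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    x = upper_triangle(a, identity)
    r_obs = pearson(x, upper_triangle(b, identity))
    hits = 1  # the observed labelling counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute the labels of b only
        if pearson(x, upper_triangle(b, order)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, hits / (permutations + 1)


# A small cophenetic (ultrametric) matrix compared with itself.
coph = [[0, 1, 3, 3],
        [1, 0, 3, 3],
        [3, 3, 0, 2],
        [3, 3, 2, 0]]
r, p = mantel_test(coph, coph, permutations=99)
print(r, p)
```

Because only labels are shuffled, the topology and fusion levels of the dendrogram behind `b` stay fixed in every permutation, which is the property the abstract contrasts with the binary tree and double-permutation procedures.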

13.
A consensus index method is an ordered pair consisting of a consensus method and a consensus index. Day and McMorris (1985) have specified two minimal axioms, one which should be satisfied by the consensus method and the other by the consensus index. The axiom for consensus indices is not satisfied by the s-consensus index. In this paper, an additional axiom, which states that a consensus index equal to one implies profile unanimity, is proposed. The s-consensus method together with a modification of the s-consensus index (i.e., normalized by the number of distinct nontrivial clusters in the profile) is shown to satisfy the two axioms proposed by Day and McMorris and the new axiom.

14.
This paper studies the random indexed dendrograms produced by agglomerative hierarchical algorithms under the non-classifiability hypothesis of independent identically distributed (i.i.d.) dissimilarities. New tests for classifiability are deduced. The corresponding test statistics are random variables attached to the indexed dendrograms, such as the indices, the survival time of singletons, the value of the ultrametric between two given points, or the size of classes in the different levels of the dendrogram. For an indexed dendrogram produced by the Single Link method on i.i.d. dissimilarities, the distribution of these random variables is computed, thus leading to explicit tests. For the case of the Average and Complete Link methods, some asymptotic results are presented. The proofs rely essentially on the theory of random graphs.

15.
blocs and legislative measures are partitioned into types so that, as nearly as possible, votes by each bloc for each type of measure are either all YEAs or all NAYs. A probability model is given for the partitions into blocs and types, and for the pattern of YEAs and NAYs given the partitions. The Alternating Randomized Combination algorithm is presented for searching for high probability partition pairs. The probability of each bloc and type in the final optimal partition pair is estimated by Markov Chain Monte Carlo. The final partition identifies 18 blocs of Senators, and 14 types of legislative measures. The blocs and types are delineated in a table reporting all decisive votes in the 103rd Congress. The blocs are characterized by the types of measures in which they vote against the majority party.

16.
A Binary Integer Program to Maximize the Agreement Between Partitions   Cited by: 1 (self-citations: 1; citations by others: 0)
This research note focuses on a problem where the cluster sizes for two partitions of the same object set are assumed known; however, the actual assignments of objects to clusters are unknown for one or both partitions. The objective is to find a contingency table that produces maximum possible agreement between the two partitions, subject to constraints that the row and column marginal frequencies for the table correspond exactly to the cluster sizes for the partitions. This problem was described by H. Messatfa (Journal of Classification, 1992, pp. 5–15), who provided a heuristic procedure based on the linear transportation problem. We present an exact solution procedure using binary integer programming. We demonstrate that our proposed method efficiently obtains optimal solutions for problems of practical size. We would like to thank the Editor, Willem Heiser, and an anonymous reviewer for helpful comments that resulted in improvements of this article.

17.
In this paper, we study a distance defined over the partitions of a finite set. Given two partitions P and Q, this distance is defined as the minimum number of transfers of an element from one class to another, required to transform P into Q. We recall the algorithm to evaluate this distance and we give some formulae for the maximum distance value between two partitions having exactly or at most p and q classes, for given p and q.
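The transfer distance defined above can be illustrated on small examples. A known way to evaluate it is to match classes of P with classes of Q so that the total overlap is as large as possible; the elements left outside the matched overlaps are the ones that must be transferred. The sketch below finds that matching by brute force over all permutations, so it is only suitable for a handful of classes. It is not claimed to be the algorithm the paper recalls, and the function name `transfer_distance` is an assumption.

```python
from itertools import permutations


def transfer_distance(p, q):
    """Minimum number of single-element transfers turning partition p into q.

    p and q are lists of disjoint blocks (sets) over the same elements.
    Brute force: factorial in the number of classes, for illustration only.
    """
    p = [set(block) for block in p]
    q = [set(block) for block in q]
    n = sum(len(block) for block in p)
    # Pad the shorter partition with empty classes so classes can be
    # matched one to one.
    size = max(len(p), len(q))
    p += [set()] * (size - len(p))
    q += [set()] * (size - len(q))
    overlap = [[len(a & b) for b in q] for a in p]
    # Best one-to-one matching of classes by total number of shared elements.
    best = max(
        sum(overlap[i][perm[i]] for i in range(size))
        for perm in permutations(range(size))
    )
    return n - best


P = [{1, 2, 3}, {4, 5}]
Q = [{1, 2}, {3, 4, 5}]
print(transfer_distance(P, Q))  # moving element 3 is enough
```

For larger partitions the same matching could be obtained with a polynomial-time assignment solver instead of enumerating permutations.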

18.
Given two or more dendrograms (rooted tree diagrams) based on the same set of objects, ways are presented of defining and obtaining common pruned trees. Bounds on the size of a largest common pruned tree are introduced, as is a categorization of objects according to whether they belong to all, some, or no largest common pruned trees. Also described is a procedure for regrafting pruned branches, yielding trees for which one can assess the reliability of the depicted relationships. The tree obtained by regrafting branches on to a largest common pruned tree is shown to contain all the classes present in the strict consensus tree. The theory is illustrated by application to two classifications of a set of forty-nine stratigraphical pollen spectra. This work was supported by the Science and Engineering Research Council. The authors are grateful to the referees for constructive criticisms of an earlier version of the paper, and to Dr. J.T. Henderson for advice on PASCAL.

19.
Dendrograms used in data analysis are ultrametric spaces, hence objects of nonarchimedean geometry. It is known that there exist p-adic representations of dendrograms. Completed by a point at infinity, they can be viewed as subtrees of the Bruhat-Tits tree associated to the p-adic projective line. The implications are that certain moduli spaces known in algebraic geometry are in fact p-adic parameter spaces of dendrograms, and stochastic classification can also be handled within this framework. At the end, we calculate the topology of the hidden part of a dendrogram.

20.
An algorithm to maximize the agreement between partitions   Cited by: 2 (self-citations: 1; citations by others: 1)
