Found 20 similar documents; search time: 187 ms
1.
On the use of ordered sets in problems of comparison and consensus of classifications  Total citations: 7, self-citations: 7, citations by others: 0
Ordered set theory provides efficient tools for the problems of comparison and consensus of classifications. Here, an overview of results obtained by the ordinal approach is presented. Latticial or semilatticial structures of the main sets of classification models are described. Many results on partitions are adaptable to dendrograms; many results on n-trees hold in any median semilattice and thus have counterparts on ordered trees and Buneman (phylogenetic) trees. For the comparison of classifications, the semimodularity of the ordinal structures involved yields computable least-move metrics based on weighted or unweighted elementary transformations. In the unweighted case, these metrics have simple characteristic properties. For the consensus of classifications, the constructive, axiomatic, and optimization approaches are considered. Natural consensus rules (majoritary, oligarchic, ...) have adequate ordinal formalizations. A unified presentation of Arrow-like characterization results is given. In the cases of n-trees, ordered trees and Buneman trees, the majority rule is a significant example where the three approaches converge. The authors would like to thank the anonymous referees for helpful suggestions on the first draft of this paper, and W. H. E. Day for his comments and his significant improvements of style.
2.
Jacob Stegenga 《Foundations of Science》2016,21(1):35-49
Consensus conferences are social techniques which involve bringing together a group of scientific experts, and sometimes also non-experts, in order to increase the public role in science and related policy, to amalgamate diverse and often contradictory evidence for a hypothesis of interest, and to achieve scientific consensus or at least the appearance of consensus among scientists. For consensus conferences that set out to amalgamate evidence, I propose three desiderata: Inclusivity (the consideration of all available evidence), Constraint (the achievement of some agreement of intersubjective assessments of the hypothesis of interest), and Evidential Complexity (the evaluation of available evidence based on a plurality of relevant evidential criteria). Two examples suggest that consensus conferences can readily satisfy Inclusivity and Evidential Complexity, but consensus conferences do not as easily satisfy Constraint. I end by discussing the relation between social inclusivity and the three desiderata.
3.
We examine the problem of aggregating several partitions of a finite set into a single consensus partition. We note that the dual concepts of clustering and isolation are especially significant in this connection. The hypothesis that a consensus partition should respect unanimity with respect to either concept leads us to stress a consensus interval rather than a single partition. The extremes of this interval are characterized axiomatically. If a sufficient totality of traits has been measured, and if measurement errors are independent, then a true classifying partition can be expected to lie in the consensus interval. The structure of the partitions in the interval lends itself to partial solutions of the consensus problem. Conditional entropy may be used to quantify the uncertainty inherent in the interval as a whole.
4.
The paper presents methodology for analyzing a set of partitions of the same set of objects, by dividing them into classes
of partitions that are similar to one another. Two different definitions are given for the consensus partition which summarizes
each class of partitions. The classes are obtained using either constrained or unconstrained clustering algorithms. Two
applications of the methodology are described.
5.
6.
Jean-Pierre Barthélemy 《Journal of Classification》1988,5(2):229-236
A class of (multiple) consensus methods for n-trees (dendroids, hierarchical classifications) is studied. This class constitutes an extension of the so-called median consensus in the sense that we get two numbers m and m′ such that: If a cluster X occurs in k n-trees of a profile P, with k ≥ m′, then it occurs in every consensus n-tree of P. If X occurs in k n-trees of P, with m ≤ k < m′, then it may, or may not, belong to a consensus n-tree of P. If X occurs in k n-trees of P, with k < m, then it cannot occur in any consensus n-tree of P. If these conditions are satisfied, the multiconsensus function is said to be thresholded by the pair (m, m′). Two results are obtained. The first one characterizes the pairs of numbers that can be viewed as thresholds for some consensus function. The second one provides a characterization of thresholded consensus methods. As an application a characterization of the quota rules is provided.
Résumé: This article deals with a class of (multiple) consensus methods for hierarchical classifications. This class generalizes the median consensus in that it consists of the methods c for which there exist two numbers m and m′ such that: if a class X belongs to k hierarchies of a profile P, with k ≥ m′, then X belongs to every consensus hierarchy of P; if X belongs to k hierarchies of P, with m ≤ k < m′, then X may or may not belong to a consensus hierarchy of P; if X belongs to k hierarchies of P, with k < m, then X belongs to no consensus hierarchy of P. The pair (m, m′) is then said to be a threshold for c. Two results are obtained. The first characterizes the pairs of numbers that are consensus thresholds. The second characterizes the consensus methods admitting a threshold. A characterization of the quota rule is deduced from this second result.
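The threshold conditions in the abstract above can be illustrated with a minimal sketch. It only sorts clusters by how often they occur in a profile; it does not build a consensus n-tree. The function name `classify_clusters`, the parameter names `m_low` and `m_high` (the lower and upper thresholds), and the representation of an n-tree as a set of frozensets are all assumptions made for this illustration, not notation from the paper.

```python
from collections import Counter

def classify_clusters(profile, m_low, m_high):
    """Sort clusters by occurrence count k in the profile.

    profile: list of n-trees, each given as a set of frozenset clusters
             (hypothetical representation chosen for this sketch).
    Returns three sets of clusters:
      required: k >= m_high, present in every consensus n-tree
      optional: m_low <= k < m_high, may or may not be present
      excluded: k < m_low, present in no consensus n-tree
    """
    counts = Counter(c for tree in profile for c in set(tree))
    required = {c for c, k in counts.items() if k >= m_high}
    optional = {c for c, k in counts.items() if m_low <= k < m_high}
    excluded = {c for c, k in counts.items() if k < m_low}
    return required, optional, excluded

# Three n-trees on {a, b, c, d}, listing only nontrivial clusters.
profile = [
    {frozenset("ab"), frozenset("abc")},
    {frozenset("ab"), frozenset("abd")},
    {frozenset("ab"), frozenset("abc")},
]
required, optional, excluded = classify_clusters(profile, m_low=2, m_high=3)
print(required, optional, excluded)
```

With thresholds (2, 3), the cluster {a, b} occurs in all three trees and is required, {a, b, c} occurs twice and is optional, and {a, b, d} occurs once and is excluded.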
7.
8.
The median procedure for n-trees as a maximum likelihood method  Total citations: 1, self-citations: 1, citations by others: 0
F. R. McMorris 《Journal of Classification》1990,7(1):77-80
A few axioms are presented which allow the median procedure for n-trees to be given a maximum likelihood interpretation. Research supported by grant number N00014-89-J-1643 from the Office of Naval Research. The author would like to thank the referees for their helpful comments.
9.
Two fundamental approaches to the comparison of classifications (e.g., partitions on the same finite set of objects) can be distinguished. One approach is based upon measures of metric dissimilarity while the other is based upon measures of similarity, or consensus. These approaches are not necessarily simple complements of each other. Instead, each captures different, limited views of comparison of two classifications. The properties of these measures are clarified by their relationships to Day's complexity models and to association measures of numerical taxonomy. The two approaches to comparison are equated with the use of separation and minimum value sensitive measures, suggesting the potential application of an intermediate sensitive measure to the problem of comparison of classifications. Such a measure is a linear combination of separation sensitive and minimum value sensitive components. The application of these intermediate measures is contrasted with the two extremes. The intermediate measure for the comparison of classifications is applied to a problem of character weighting arising in the analysis of Australian stream basins. We thank Bill Day, Mike Austin, Peter Minchin and two anonymous referees for many helpful comments. We also thank P. Arabie for useful discussion of consensus methods and character weighting.
10.
Consensus supertrees: The synthesis of rooted trees containing overlapping sets of labeled leaves  Total citations: 2, self-citations: 2, citations by others: 0
A. D. Gordon 《Journal of Classification》1986,3(2):335-348
Given two dendrograms (rooted tree diagrams) which have some but not all of their base points in common, a supertree is a dendrogram from which each of the original trees can be regarded as samples. The distinction is made between inconsistent and consistent sample trees, defined by whether or not the samples provide contradictory information about the supertree. An algorithm for obtaining the strict consensus supertree of two consistent sample trees is presented, as are procedures for merging two inconsistent sample trees. Some suggestions for future work are made.
11.
Many methods and algorithms to generate random trees of many kinds have been proposed in the literature. No procedure exists
however for the generation of dendrograms with randomized fusion levels. Randomized dendrograms can be obtained by randomizing
the associated cophenetic matrix. Two algorithms are described. The first one generates completely random dendrograms, i.e.,
trees with a random topology, random fusion level values, and random assignment of the labels. The second algorithm uses a
double-permutation procedure to randomize a given dendrogram; it proceeds by randomization of the fixed fusion levels, instead
of using random fusion level values. A proof is presented that the double-permutation procedure is a Uniform Random Generation
Algorithm sensu Furnas (1984), and a complete example is given.
This work was supported by NSERC Grant No. A7738 to P. Legendre and by a NSERC scholarship to F.-J. Lapointe.
12.
Classifications are generally pictured in the form of hierarchical trees, also called dendrograms. A dendrogram is the graphical
representation of an ultrametric (=cophenetic) matrix; so dendrograms can be compared to one another by comparing their cophenetic
matrices. Three methods used in testing the correlation between matrices corresponding to dendrograms are evaluated. The three
permutational procedures make use of different aspects of the information to compare dendrograms: the Mantel procedure permutes
label positions only; the binary tree methods randomize the topology as well; the double-permutation procedure is based on
all the information included in a dendrogram, that is: topology, label positions, and cluster heights. Theoretical and empirical
investigations of these methods are carried out to evaluate their relative performance. Simulations show that the Mantel test
is too conservative when applied to the comparison of dendrograms; the methods of binary tree comparisons do slightly better;
only the double-permutation test provides unbiased type I error.
Résumé: The trees used to illustrate groupings are generally represented as hierarchical classifications, or dendrograms.
A dendrogram graphically represents the information contained in the ultrametric (= cophenetic) matrix corresponding to
the classification. Dendrograms can therefore be compared by comparing the corresponding ultrametric matrices. We compare
three methods for assessing the statistical significance of the correlation coefficient measured between two ultrametric
matrices. These three permutation tests take different aspects into account when comparing dendrograms: the Mantel test
permutes the leaves of the tree, the binary tree methods permute the leaves and the topology, whereas the double-permutation
procedure permutes the leaves, the topology, and the fusion levels of the dendrograms being compared. The relative efficiency
of the three methods is evaluated empirically and theoretically. Our results suggest preferring the double-permutation test
for the comparison of dendrograms: the Mantel test proves too conservative, while the binary tree methods are not always
adequate.
This work was supported by NSERC grant no. A7738 to Pierre Legendre and by a NSERC scholarship to F.-J. Lapointe.
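As a rough illustration of the simplest of the three procedures in the abstract above, the following sketch runs a basic Mantel-style permutation test on two cophenetic matrices by permuting label positions only. It is not the authors' implementation, and it does not cover the binary tree or double-permutation methods. The function names (`upper`, `pearson`, `mantel`), the default of 999 permutations, the fixed seed, and the example matrices are assumptions made for this sketch.

```python
import random
from math import sqrt

def upper(mat, order):
    """Upper-triangle entries of mat after relabeling objects by `order`."""
    n = len(order)
    return [mat[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mantel(a, b, n_perm=999, seed=0):
    """One-sided Mantel test: permute the labels of b, keep a fixed."""
    rng = random.Random(seed)
    ident = list(range(len(a)))
    xa = upper(a, ident)
    r_obs = pearson(xa, upper(b, ident))
    hits = 0
    for _ in range(n_perm):
        perm = ident[:]
        rng.shuffle(perm)
        # Small tolerance so floating-point ties count as "at least as large".
        if pearson(xa, upper(b, perm)) >= r_obs - 1e-12:
            hits += 1
    # The observed arrangement is counted as one of the permutations.
    return r_obs, (hits + 1) / (n_perm + 1)

# Cophenetic matrix of a 4-object dendrogram: {0,1} fuse at 1, {2,3} at 2, all at 3.
A = [[0, 1, 3, 3], [1, 0, 3, 3], [3, 3, 0, 2], [3, 3, 2, 0]]
r, p = mantel(A, A)
print(r, p)
```

Because only labels are permuted, every permuted matrix keeps the same topology and fusion levels, which is the limitation the abstract attributes to the Mantel procedure when it is applied to dendrograms.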
13.
Ralph Stinebrickner 《Journal of Classification》1986,3(2):319-327
A consensus index method is an ordered pair consisting of a consensus method and a consensus index. Day and McMorris (1985) have specified two minimal axioms, one which should be satisfied by the consensus method and the other by the consensus index. The axiom for consensus indices is not satisfied by the s-consensus index. In this paper, an additional axiom, which states that a consensus index equal to one implies profile unanimity, is proposed. The s-consensus method together with a modification of the s-consensus index (i.e., normalized by the number of distinct nontrivial clusters in the profile) is shown to satisfy the two axioms proposed by Day and McMorris and the new axiom.
14.
This paper studies the random indexed dendrograms produced by agglomerative hierarchical algorithms under the non-classifiability
hypothesis of independent identically distributed (i.i.d.) dissimilarities. New tests for classifiability are deduced. The
corresponding test statistics are random variables attached to the indexed dendrograms, such as the indices, the survival
time of singletons, the value of the ultrametric between two given points, or the size of classes in the different levels
of the dendrogram. For an indexed dendrogram produced by the Single Link method on i.i.d. dissimilarities, the distribution
of these random variables is computed, thus leading to explicit tests. For the case of the Average and Complete Link methods,
some asymptotic results are presented. The proofs rely essentially on the theory of random graphs.
15.
J. A. Hartigan 《Journal of Classification》2000,17(1):29-49
blocs and legislative measures are partitioned into types so that, as nearly as possible, votes by each bloc for each type of measure are either all YEAs or all NAYs. A probability
model is given for the partitions into blocs and types, and for the pattern of YEAs and NAYs given the partitions. The Alternating
Randomized Combination algorithm is presented for searching for high probability partition pairs. The probability of each
bloc and type in the final optimal partition pair is estimated by Markov Chain Monte Carlo. The final partition identifies
18 blocs of Senators, and 14 types of legislative measures. The blocs and types are delineated in a table reporting all decisive
votes in the 103rd Congress. The blocs are characterized by the types of measures in which they vote against the majority
party.
16.
This research note focuses on a problem where the cluster sizes for two partitions of the same object set are assumed known;
however, the actual assignments of objects to clusters are unknown for one or both partitions. The objective is to find a
contingency table that produces maximum possible agreement between the two partitions, subject to constraints that the row
and column marginal frequencies for the table correspond exactly to the cluster sizes for the partitions. This problem was
described by H. Messatfa (Journal of Classification, 1992, pp. 5–15), who provided a heuristic procedure based on the linear transportation problem. We present an exact solution
procedure using binary integer programming. We demonstrate that our proposed method efficiently obtains optimal solutions
for problems of practical size.
We would like to thank the Editor, Willem Heiser, and an anonymous reviewer for helpful comments that resulted in improvements
of this article.
17.
Irene Charon, Lucile Denoeud, Alain Guenoche, Olivier Hudry 《Journal of Classification》2006,23(1):103-121
In this paper, we study a distance defined over the partitions of a finite set. Given two partitions P and Q, this distance
is defined as the minimum number of transfers of an element from one class to another, required to transform P into Q. We
recall the algorithm to evaluate this distance and we give some formulae for the maximum distance value between two partitions
having exactly or at most p and q classes, for given p and q.
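The transfer distance defined in the abstract above can be sketched as follows. This is not the algorithm recalled in the paper: it relies on the standard reformulation as an assignment problem (the distance equals the number of elements minus the largest total overlap achievable by matching classes of P to classes of Q one to one) and solves it by brute force over permutations, which is only practical for a handful of classes. The function name `transfer_distance` and the convention that empty classes may be created or removed are assumptions of this sketch.

```python
from itertools import permutations

def transfer_distance(p, q):
    """Minimum number of single-element transfers turning partition p into q.

    p, q: lists of classes (iterables of hashable elements) over the same set.
    """
    p = [set(c) for c in p]
    q = [set(c) for c in q]
    n = sum(len(c) for c in p)
    # Pad the shorter partition with empty classes so both have k classes.
    k = max(len(p), len(q))
    p += [set() for _ in range(k - len(p))]
    q += [set() for _ in range(k - len(q))]
    overlap = [[len(a & b) for b in q] for a in p]
    # Best one-to-one matching of classes, found by exhaustive search.
    best = max(
        sum(overlap[i][sigma[i]] for i in range(k))
        for sigma in permutations(range(k))
    )
    # Elements outside the matched overlaps each need one transfer.
    return n - best

P = [{1, 2, 3}, {4, 5}]
Q = [{1, 2}, {3, 4, 5}]
print(transfer_distance(P, Q))  # 1: move element 3 to the other class
```

For larger numbers of classes, the exhaustive search would normally be replaced by a polynomial-time assignment solver; the brute-force version is kept here only so the sketch has no dependencies.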
18.
Given two or more dendrograms (rooted tree diagrams) based on the same set of objects, ways are presented of defining and obtaining common pruned trees. Bounds on the size of a largest common pruned tree are introduced, as is a categorization of objects according to whether they belong to all, some, or no largest common pruned trees. Also described is a procedure for regrafting pruned branches, yielding trees for which one can assess the reliability of the depicted relationships. The tree obtained by regrafting branches on to a largest common pruned tree is shown to contain all the classes present in the strict consensus tree. The theory is illustrated by application to two classifications of a set of forty-nine stratigraphical pollen spectra. This work was supported by the Science and Engineering Research Council. The authors are grateful to the referees for constructive criticisms of an earlier version of the paper, and to Dr. J.T. Henderson for advice on PASCAL.
19.
Patrick Erik Bradley 《Journal of Classification》2008,25(1):27-42
Dendrograms used in data analysis are ultrametric spaces, hence objects of nonarchimedean geometry. It is known that there exist p-adic representations of dendrograms. Completed by a point at infinity, they can be viewed as subtrees of the Bruhat-Tits tree associated to the p-adic projective line. The implications are that certain moduli spaces known in algebraic geometry are in fact p-adic parameter spaces of dendrograms, and stochastic classification can also be handled within this framework. At the end, we calculate the topology of the hidden part of a dendrogram.
20.
An algorithm to maximize the agreement between partitions  Total citations: 2, self-citations: 1, citations by others: 1
H. Messatfa 《Journal of Classification》1992,9(1):5-15