Similar documents
20 similar documents found (search time: 31 ms)
1.
This paper presents the development of a new methodology which simultaneously estimates, in a least-squares fashion, both an ultrametric tree and respective variable weightings for profile data that have been converted into (weighted) Euclidean distances. We first review the relevant classification literature on this topic. The new methodology is presented, including the alternating least-squares algorithm used to estimate the parameters. The method is applied to a synthetic data set with known structure as a test of its operation. An application of this new methodology to ethnic group rating data is also discussed. Finally, extensions of the procedure to model additive, multiple, and three-way trees are mentioned. The first author is supported as Bevoegdverklaard Navorser of the Belgian Nationaal Fonds voor Wetenschappelijk Onderzoek.
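The weighted Euclidean distances that serve as the method's input, and the ultrametric (three-point) condition that the fitted tree distances must satisfy, can be sketched as follows. This is a minimal illustration, not the paper's alternating least-squares algorithm; the function names and the brute-force check are ours:

```python
import numpy as np

def weighted_distances(X, w):
    """Weighted Euclidean distances between rows of a profile matrix X.
    w holds one non-negative weight per variable (column)."""
    diff = X[:, None, :] - X[None, :, :]          # pairwise differences
    return np.sqrt((w * diff**2).sum(axis=-1))    # d_ij = sqrt(sum_k w_k (x_ik - x_jk)^2)

def is_ultrametric(D, tol=1e-9):
    """Brute-force check of the three-point condition:
    d_ij <= max(d_ik, d_jk) for every triple (i, j, k)."""
    n = len(D)
    return all(D[i, j] <= max(D[i, k], D[j, k]) + tol
               for i in range(n) for j in range(n) for k in range(n))
```

A fitted ultrametric tree corresponds exactly to a matrix passing `is_ultrametric`; the estimation problem is to find such a matrix (and the weights `w`) closest in least squares to the observed distances.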

2.
In quantum computation, non-classical features such as superposition states and entanglement are used to solve problems in new ways, impossible on classical digital computers. We illustrate by the Deutsch algorithm how a quantum computer can use superposition states to outperform any classical computer. We comment on the view of a quantum computer as a massively parallel computer and recall Amdahl's law for a classical parallel computer. We argue that the view of quantum computation as a massively parallel computation disregards the presence of entanglement in a general quantum computation and the non-classical way in which parallel results are combined to obtain the final output.

3.
Ultrametric tree representations of incomplete dissimilarity data (total citations: 2; self-citations: 2; other citations: 0)
The least squares algorithm for fitting ultrametric trees to proximity data, originally proposed by Carroll and Pruzansky and further elaborated by De Soete, is extended to handle missing data. A Monte Carlo evaluation reveals that the algorithm is capable of recovering an ultrametric tree underlying an incomplete set of error-perturbed dissimilarities quite well. Geert De Soete is Aangesteld Navorser of the Belgian Nationaal Fonds voor Wetenschappelijk Onderzoek.

4.
Models for the representation of proximity data (similarities/dissimilarities) can be categorized into one of three groups of models: continuous spatial models, discrete nonspatial models, and hybrid models (which combine aspects of both spatial and discrete models). Multidimensional scaling models and associated methods, used for the spatial representation of such proximity data, have been devised to accommodate two-, three-, and higher-way arrays. At least one model/method for overlapping (but generally non-hierarchical) clustering, called INDCLUS (Carroll and Arabie 1983), has been devised for the case of three-way arrays of proximity data. Tree-fitting methods, used for the discrete network representation of such proximity data, have thus far only been devised to handle two-way arrays. This paper develops a new methodology called INDTREES (for INdividual Differences in TREE Structures) for fitting various (discrete) tree structures to three-way proximity data. This individual differences generalization is one in which different individuals, for example, are assumed to base their judgments on the same family of trees, but are allowed to have different node heights and/or branch lengths. We initially present an introductory overview focussing on existing two-way models. The INDTREES model and algorithm are then described in detail. Monte Carlo results for the INDTREES fitting of four different three-way data sets are presented. In the application, a single ultrametric tree is fitted to three-way proximity data derived from intention-to-buy data for various brands of over-the-counter pain relievers for three common types of maladies. Finally, we briefly describe how the INDTREES procedure can be extended to accommodate hybrid modelling, as well as to handle other types of applications.

5.
Consider N entities to be classified (e.g., geographical areas), a matrix of dissimilarities between pairs of entities, and a graph H with vertices associated with these entities such that the edges join the vertices corresponding to contiguous entities. The split of a cluster is the smallest dissimilarity between an entity of this cluster and an entity outside of it. The single-linkage algorithm (ignoring contiguity between entities) provides partitions into M clusters for which the smallest split of the clusters, called the split of the partition, is maximum. We study here the partitioning of the set of entities into M connected clusters for all M between N - 1 and 2 (i.e., clusters such that the subgraphs of H induced by their corresponding sets of entities are connected) with maximum split subject to that condition. We first provide an exact algorithm with O(N²) complexity for the particular case in which H is a tree. This algorithm suggests in turn a first heuristic algorithm for the general problem. Several variants of this heuristic are also explored. We then present an exact algorithm for the general case based on iterative determination of cocycles of subtrees and on the solution of auxiliary set covering problems. As solution of the latter problems is time-consuming for large instances, we provide another heuristic in which the auxiliary set covering problems are solved approximately. Computational results obtained with the exact and heuristic algorithms are presented on test problems from the literature.
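The split criterion described above is simple to state in code. The following is an illustrative sketch of the objective being maximized, not the paper's exact or heuristic algorithms; the function name is ours:

```python
import numpy as np

def split_of_partition(D, labels):
    """Split of a partition: the smallest dissimilarity between two
    entities placed in different clusters.  The single-linkage-based
    methods discussed above seek partitions maximizing this value."""
    labels = np.asarray(labels)
    cross = labels[:, None] != labels[None, :]    # True for pairs in different clusters
    return D[cross].min()
```

For example, with three entities where entity 2 is far from entities 0 and 1, the partition {0, 1} | {2} has split equal to the smaller of the two cross-cluster dissimilarities.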

6.
We argue from the Church-Turing thesis (Kleene, Mathematical Logic. New York: Wiley, 1967) that a program can be considered as equivalent to a formal language similar to predicate calculus, where predicates can be taken as functions. We can relate such a calculus to Wittgenstein's first major work, the Tractatus, and use the Tractatus and its theses as a model of the formal classical definition of a computer program. However, Wittgenstein found flaws in his initial great work and explored these flaws in a new thesis described in his second great work, the Philosophical Investigations. The question we address is: can computer science make the same leap? We propose, because of the flaws identified by Wittgenstein, that computers will never have the possibility of natural communication with people unless they become active participants of human society. The essential difference between formal models used in computing and human communication is that formal models are based upon rational sets, whereas people are not so restricted. We introduce irrational sets as a concept that requires the use of an abductive inference system. However, formal models are still considered central to our means of using hypotheses through deduction to make predictions about the world. These formal models must be continually updated in response to people's changes in their way of seeing the world. We propose that one mechanism used to keep track of these changes is the Peircean abductive loop.

7.
Data holders, such as statistical institutions and financial organizations, have a very serious and demanding task when producing data for official and public use: controlling the risk of identity disclosure and protecting sensitive information when they communicate data-sets among themselves, to governmental agencies, and to the public. One of the techniques applied is that of micro-aggregation. In a Bayesian setting, micro-aggregation can be viewed as the optimal partitioning of the original data-set based on the minimization of an appropriate measure of discrepancy, or distance, between two posterior distributions, one of which is conditional on the original data-set and the other conditional on the aggregated data-set. Assuming d-variate normal data-sets and using several measures of discrepancy, it is shown that the asymptotically optimal equal probability m-partition of ℝ^d, with m^(1/d) an integer, is the convex one which is provided by hypercubes whose sides are formed by hyperplanes perpendicular to the canonical axes, no matter which discrepancy measure has been used. On the basis of the above result, a method that produces a sub-optimal partition with a very small computational cost is presented.

8.
A class of (multiple) consensus methods for n-trees (dendroids, hierarchical classifications) is studied. This class constitutes an extension of the so-called median consensus in the sense that we get two numbers m₁ and m₂ such that: if a cluster X occurs in k n-trees of a profile P, with k ≥ m₂, then it occurs in every consensus n-tree of P; if X occurs in k n-trees of P, with m₁ ≤ k < m₂, then it may, or may not, belong to a consensus n-tree of P; if X occurs in k n-trees of P, with k < m₁, then it cannot occur in any consensus n-tree of P. If these conditions are satisfied, the multiconsensus function is said to be thresholded by the pair (m₁, m₂). Two results are obtained. The first one characterizes the pairs of numbers that can be viewed as thresholds for some consensus function. The second one provides a characterization of thresholded consensus methods. As an application, a characterization of the quota rules is provided.
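A quota rule of the kind characterized above can be sketched in a few lines. This is illustrative only; the representation of an n-tree as a set of frozenset clusters and the function name are our assumptions, not the paper's formalism:

```python
from collections import Counter

def quota_consensus(profile, m):
    """Quota rule: keep every cluster that occurs in at least m of the
    input n-trees.  Each tree is given as a set of frozenset clusters."""
    counts = Counter(cluster for tree in profile for cluster in tree)
    return {cluster for cluster, k in counts.items() if k >= m}
```

With m set to a strict majority of the profile size, this reduces to the familiar majority (median-style) consensus; the thresholded family described above generalizes this by allowing an indeterminate band between the two thresholds.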

9.
Designing models of complex phenomena is a difficult task in engineering that can be tackled by composing a number of partial models to produce a global model of the phenomena. We propose to embed the partial models in software agents and to implement their composition as a cooperative negotiation between the agents. The resulting multiagent system provides a global model of a phenomenon. We applied this approach in modelling two complex physiological processes: heart rate regulation and the glucose-insulin metabolism. Beyond the effectiveness demonstrated in these two applications, the idea of using models associated with software agents to account for complex phenomena is in accordance with current tendencies in epistemology, where an increasing use of computational models for scientific explanation and analysis is evident. Therefore, our approach has not only a practical, but also a theoretical significance: agents embedding models are a technology suitable both for representing and for investigating reality.
Correspondence: Francesco Amigoni

10.
We review methods of qualitative factor analysis (QFA) developed by the author and his collaborators over the last decade and discuss the use of QFA methods for the additive clustering problem. The QFA method consists of, first, finding a square Boolean matrix, in a fixed set of Boolean matrices with simple structures, to approximate a given similarity matrix, and, second, repeating this process again and again using residual similarity matrices. We present convergence properties for three versions of the method, provide cluster interpretations for results obtained from the algorithms, and give formulas for evaluating each factor's share of the variance of the initial similarities. I am indebted to Professor P. Arabie and the referees for valuable comments and editing of the text.
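A single step of the residual scheme described above — fitting one Boolean (cluster-indicator) factor to a similarity matrix and passing the residual on — can be sketched as follows. This is illustrative: the search over candidate subsets, the heart of the QFA method, is omitted, and the names are ours:

```python
import numpy as np

def fit_one_factor(S, subset):
    """Fit a single factor: the Boolean matrix b b^T restricted to
    off-diagonal entries, scaled by its least-squares weight.
    Returns the weight and the residual similarity matrix."""
    b = np.zeros(len(S))
    b[list(subset)] = 1.0
    B = np.outer(b, b)
    np.fill_diagonal(B, 0.0)                 # self-similarities ignored
    w = (S * B).sum() / B.sum()              # least-squares optimal weight
    return w, S - w * B
```

Iterating this on the residual matrix mirrors the "again and again" extraction loop in the abstract; the weight w is each factor's contribution to explaining the similarities.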

11.
Five different methods for obtaining a rational initial estimate of the stimulus space in the INDSCAL model were compared using the SINDSCAL program for fitting INDSCAL. The effects of the number of stimuli, the number of subjects, the dimensionality, and the amount of error on the quality and efficiency of the final SINDSCAL solution were investigated in a Monte Carlo study. We found that the quality of the final solution was not affected by the choice of the initialization method, suggesting that SINDSCAL finds a global optimum regardless of the initialization method used. The most efficient procedures were the methods proposed by de Leeuw and Pruzansky (1978) and by Flury and Gautschi (1986) for the simultaneous diagonalization of several positive definite symmetric matrices, and a method based on linearly constraining the stimulus space using the CANDELINC approach developed by Carroll, Pruzansky, and Kruskal (1980). Geert De Soete is supported as Bevoegdverklaard Navorser of the Belgian Nationaal Fonds voor Wetenschappelijk Onderzoek. The authors gratefully acknowledge the helpful comments and suggestions of the reviewers.

12.
13.
In this paper we describe in some detail a formal computer model of inferential discourse based on a belief system. The key issue is that a logical model in a computer, based on rational sets, can usefully model a human situation based on irrational sets. The background of this work is explained elsewhere, as is the issue of rational and irrational sets (Billinge and Addis, in: Magnani and Dossena (eds.), Computing, philosophy and cognition, 2004; Stepney et al., Journey: Non-classical philosophy—socially sensitive computing in journeys non-classical computation: A grand challenge for computing research, 2004). The model is based on the Belief System (Addis and Gooding, Proceedings of the AISB’99 Symposium on Scientific Creativity, 1999) and it provides a mechanism for choosing queries based on a range of belief. We explain how it provides a way to update the belief based on query results, thus modelling others’ experience by inference. We also demonstrate that for the same internal experience, different models can be built for different actors.
Correspondence: Tom Addis

14.
Syntactic and structural models specify relationships between their constituents but cannot show what outcomes their interaction would produce over time in the world. Simulation consists in iterating the states of a model, so as to produce behaviour over a period of simulated time. Iteration enables us to trace the implications and outcomes of inference rules and other assumptions implemented in the models that make up a theory. We apply this method to experiments which we treat as models of the particular aspects of reality they are designed to investigate. Scientific experiments are constantly designed and re-designed in the context of implementation and use. They mediate between theoretical understanding and the practicalities of engaging with the empirical and social world. In order to model experiments we need to identify and represent features that all experiments have in common. We treat these features as parameters of a general model of experiment so that by varying these parameters different types of experiment can be modelled.
Correspondence: D. C. Gooding

15.
In multivariate discrimination of several normal populations, the optimal classification procedure is based on quadratic discriminant functions. We compare expected error rates of the quadratic classification procedure when the covariance matrices are estimated under the following four models: (i) arbitrary covariance matrices, (ii) common principal components, (iii) proportional covariance matrices, and (iv) identical covariance matrices. Using Monte Carlo simulation to estimate expected error rates, we study the performance of the four discrimination procedures for five different parameter setups corresponding to standard situations that have been used in the literature. The procedures are examined for sample sizes ranging from 10 to 60, and for two to four groups. Our results quantify the extent to which a parsimonious method reduces error rates, and demonstrate that choosing a simple method of discrimination is often beneficial even if the underlying model assumptions are wrong. The authors wish to thank the editor and three referees for their helpful comments on the first draft of this article. M. J. Schmid was supported by grants no. 2.724-0.85 and 2.038-0.86 of the Swiss National Science Foundation.
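The quadratic discriminant rule underlying all four covariance models can be sketched as a minimal Gaussian classifier. This is not the authors' simulation code; names are illustrative. Passing each group its own estimated covariance gives the quadratic rule, while passing the same pooled matrix for every group reduces it to the linear (identical-covariance) case:

```python
import numpy as np

def gaussian_classify(x, means, covs, priors):
    """Assign x to the class maximizing the Gaussian log-density plus
    log-prior: -0.5 * (log|S_g| + (x - mu_g)' S_g^{-1} (x - mu_g)) + log p_g."""
    scores = []
    for mu, S, p in zip(means, covs, priors):
        diff = x - mu
        _, logdet = np.linalg.slogdet(S)
        scores.append(-0.5 * (logdet + diff @ np.linalg.solve(S, diff)) + np.log(p))
    return int(np.argmax(scores))
```

The parsimony comparison in the abstract amounts to how the `covs` argument is estimated (separately, via common principal components, proportionally, or pooled) before this same rule is applied.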

16.
This paper presents a general approach for fitting the ADCLUS (Shepard and Arabie 1979; Arabie, Carroll, DeSarbo, and Wind 1981), INDCLUS (Carroll and Arabie 1983), and potentially a special case of the GENNCLUS (DeSarbo 1982) models. The proposed approach, based largely on a separability property observed for the least squares loss function being optimized, offers increased efficiency and other advantages over existing approaches like MAPCLUS (Arabie and Carroll 1980) for fitting the ADCLUS model, and the INDCLUS method for fitting the INDCLUS model. The new procedure (called SINDCLUS) is applied to three sets of empirical data to demonstrate the effectiveness of the SINDCLUS methodology. Finally, some potentially useful extensions are discussed.

17.
Minimum sum of diameters clustering (total citations: 1; self-citations: 1; other citations: 0)
The problem of determining a partition of a given set of N entities into M clusters such that the sum of the diameters of these clusters is minimum has been studied by Brucker (1978). He proved that it is NP-complete for M ≥ 3 and mentioned that its complexity was unknown for M = 2. We provide an O(N³ log N) algorithm for this latter case. Moreover, we show that determining a partition into two clusters which minimizes any given function of the diameters can be done in O(N⁵) time. Acknowledgments: This research was supported by the Air Force Office of Scientific Research Grant AFOSR 0271 to Rutgers University. We are grateful to Yves Crama for several insightful remarks and to an anonymous referee for detailed comments.
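For intuition, the M = 2 objective can be evaluated by brute force on very small instances. This exponential-time sketch only illustrates the objective; it is not the O(N³ log N) algorithm the paper provides, and the names are ours:

```python
import itertools
import numpy as np

def min_sum_diameters_2(D):
    """Brute-force best bipartition of entities 0..n-1 minimizing the
    sum of the two cluster diameters (max intra-cluster dissimilarity)."""
    n = len(D)

    def diam(S):
        return max((D[i, j] for i, j in itertools.combinations(S, 2)), default=0.0)

    best = (np.inf, None)
    for mask in range(1, 2 ** (n - 1)):          # entity n-1 fixed in cluster A
        A = [i for i in range(n) if not (mask >> i) & 1]
        B = [i for i in range(n) if (mask >> i) & 1]
        cand = (diam(A) + diam(B), (tuple(A), tuple(B)))
        best = min(best, cand, key=lambda t: t[0])
    return best
```

Fixing one entity's side halves the search space, since swapping cluster labels leaves the objective unchanged.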

18.
In this paper we want to examine how the mutual understanding of speakers is reached during a conversation through collaborative processes, and what role is played by abductive inference (in the Peircean sense) in these processes. We do this by bringing together contributions from a variety of disciplines, such as logic, philosophy of language, and psychology. When speakers are engaged in a conversation, they refer to a supposed common ground: every participant ascribes to the others some knowledge, beliefs, opinions, etc. on which to rely in order to reach mutual understanding. As the conversation unfolds, this common ground is continually corrected and reshaped by the interchanges. Abductive reasoning takes place, in a collaborative setting, in order to build new possible theories about the common ground. In reconstructing this process through the use of a working example, we argue that integrating a collaborative perspective within the Peircean theory of abduction can help to solve some of the drawbacks that critics of the latter have outlined, for example its permissiveness and non-generativity.
Correspondence: Roberta Ferrario

19.
This paper proposes a measure of spatial homogeneity for sets of d-dimensional points based on nearest neighbor distances. Tests for spatial uniformity are examined which assess the tendency of the entire data set to aggregate and evaluate the character of individual clusters. The sizes and powers of three statistical tests of uniformity against aggregation, regularity, and unimodality are studied to determine robustness. The paper also studies the effects of normalization and incorrect prior information. A percentile frame sampling procedure is proposed that does not require a sampling window but is superior to a toroidal frame and to buffer zone sampling in particular situations. Examples test two data sets for homogeneity and search the results of a hierarchical clustering for homogeneous clusters. This work was partially supported by NSF Grant ECS-8300204.
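A basic nearest-neighbor distance statistic of the kind these tests build on can be computed as follows. This is the raw statistic only, not the paper's tests, normalizations, or sampling procedures; the function name is ours:

```python
import numpy as np

def mean_nn_distance(X):
    """Mean nearest-neighbor distance of a point set (one point per row).
    Aggregated (clustered) data yield smaller values than uniform data
    of the same intensity; regular data yield larger values."""
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    np.fill_diagonal(D, np.inf)                  # exclude self-distances
    return D.min(axis=1).mean()
```

Uniformity tests compare such a statistic against its distribution under complete spatial randomness, with edge corrections (e.g. toroidal or buffer-zone frames) handling boundary bias.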

20.
A mathematical programming algorithm is developed for fitting ultrametric or additive trees to proximity data where external constraints are imposed on the topology of the tree. The two procedures minimize a least squares loss function. The method is illustrated on both synthetic and real data. A constrained ultrametric tree analysis was performed on similarities between 32 subjects based on preferences for ten odors, while a constrained additive tree analysis was carried out on some proximity data between kinship terms. Finally, some extensions of the methodology to other tree fitting procedures are mentioned. The first author is supported as Bevoegdverklaard Navorser of the Belgian Nationaal Fonds voor Wetenschappelijk Onderzoek.

