Similar Documents
20 similar documents found (search time: 375 ms)
1.
The INDSCAL individual differences scaling model is extended by assuming dimensions specific to each stimulus or other object, as well as dimensions common to all stimuli or objects. An alternating maximum likelihood procedure is used to seek maximum likelihood estimates of all parameters of this EXSCAL (Extended INDSCAL) model, including the parameters of the monotone splines assumed in a quasi-nonmetric approach. The rationale for and numerical details of this approach are described and discussed, and the resulting EXSCAL method is illustrated with data on the perception of musical timbres.

2.
A latent class probit model for analyzing pick any/N data (total citations: 3; self-citations: 3; other citations: 0)
A latent class probit model is developed in which it is assumed that the binary data of a particular subject follow a finite mixture of multivariate Bernoulli distributions. An EM algorithm for fitting the model is described, and a Monte Carlo procedure for testing the number of latent classes required to adequately describe the data is discussed. In the final section, an application of the latent class probit model to some intended-purchase data for residential telecommunication devices is reported. Geert De Soete is supported as “Bevoegdverklaard Navorser” of the Belgian “Nationaal Fonds voor Wetenschappelijk Onderzoek.”
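The mixture-of-Bernoullis formulation lends itself to a compact EM sketch. The code below is an illustrative re-implementation under standard assumptions (independent binary items within a class, uniform starting mixing proportions), not the paper's exact algorithm; all names and defaults are ours.

```python
import numpy as np

def bernoulli_mixture_em(X, n_classes, n_iter=200, seed=0):
    """EM for a finite mixture of multivariate Bernoulli distributions.

    X: (n_subjects, n_items) binary 0/1 array. Returns mixing proportions
    pi and per-class item probabilities theta. Illustrative sketch only."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, m = X.shape
    pi = np.full(n_classes, 1.0 / n_classes)
    theta = rng.uniform(0.25, 0.75, size=(n_classes, m))
    for _ in range(n_iter):
        # E-step: posterior class memberships, computed in log space
        log_post = (X @ np.log(theta).T
                    + (1.0 - X) @ np.log(1.0 - theta).T
                    + np.log(pi))
        log_post -= log_post.max(axis=1, keepdims=True)
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: closed-form updates for pi and theta
        nk = resp.sum(axis=0)
        pi = nk / n
        theta = np.clip((resp.T @ X) / nk[:, None], 1e-6, 1.0 - 1e-6)
    return pi, theta
```

Both update steps are in closed form, which is what makes the EM approach attractive for this model class.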

3.
Five different methods for obtaining a rational initial estimate of the stimulus space in the INDSCAL model were compared using the SINDSCAL program for fitting INDSCAL. The effects of the number of stimuli, the number of subjects, the dimensionality, and the amount of error on the quality and efficiency of the final SINDSCAL solution were investigated in a Monte Carlo study. We found that the quality of the final solution was not affected by the choice of the initialization method, suggesting that SINDSCAL finds a global optimum regardless of the initialization method used. The most efficient procedures were the methods proposed by de Leeuw and Pruzansky (1978) and by Flury and Gautschi (1986) for the simultaneous diagonalization of several positive definite symmetric matrices, and a method based on linearly constraining the stimulus space using the CANDELINC approach developed by Carroll, Pruzansky, and Kruskal (1980). Geert De Soete is supported as Bevoegdverklaard Navorser of the Belgian Nationaal Fonds voor Wetenschappelijk Onderzoek. The authors gratefully acknowledge the helpful comments and suggestions of the reviewers.

4.
A latent class vector model for preference ratings (total citations: 1; self-citations: 1; other citations: 1)
A latent class formulation of the well-known vector model for preference data is presented. Assuming preference ratings as input data, the model simultaneously clusters the subjects into a small number of homogeneous groups (or latent classes) and constructs a joint geometric representation of the choice objects and the latent classes according to a vector model. The distributional assumptions on which the latent class approach is based are analogous to the distributional assumptions that are consistent with the common practice of fitting the vector model to preference data by least squares methods. An EM algorithm for fitting the latent class vector model is described as well as a procedure for selecting the appropriate number of classes and the appropriate number of dimensions. Some illustrative applications of the latent class vector model are presented and some possible extensions are discussed. Geert De Soete is supported as “Bevoegdverklaard Navorser” of the Belgian “Nationaal Fonds voor Wetenschappelijk Onderzoek.”

5.
Carroll and Chang have derived the symmetric CANDECOMP model from the INDSCAL model, to fit symmetric matrices of approximate scalar products in the least squares sense. Typically, the CANDECOMP algorithm is used to estimate the parameters. In the present paper it is shown that negative weights may occur with CANDECOMP. This phenomenon can be suppressed by updating the weights by the Nonnegative Least Squares Algorithm. A potential drawback of the resulting procedure is that it may produce two different versions of the stimulus space matrix. To obviate this possibility, a symmetry preserving algorithm is offered, which can be monitored to produce non-negative weights as well. This work was partially supported by the Royal Netherlands Academy of Arts and Sciences.
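The nonnegative weight update described in this abstract amounts to solving, for each subject's scalar-product matrix S, min ||S − X diag(w) Xᵀ||² subject to w ≥ 0, which vectorizes into an ordinary nonnegative least squares problem. Below is a minimal sketch; it substitutes a simple projected-gradient NNLS for the Lawson–Hanson routine usually cited, and all names are ours.

```python
import numpy as np

def nnls_pg(A, b, n_iter=2000):
    """Nonnegative least squares min ||A w - b||, w >= 0, by projected
    gradient descent -- a stand-in for the Nonnegative Least Squares
    Algorithm referred to in the abstract."""
    AtA, Atb = A.T @ A, A.T @ b
    step = 1.0 / np.linalg.eigvalsh(AtA)[-1]   # 1 / Lipschitz constant
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        w = np.maximum(0.0, w - step * (AtA @ w - Atb))
    return w

def nonneg_weights(S, X):
    """Update one subject's dimension weights in S ~ X diag(w) X^T, w >= 0.
    Each column of A is the vectorized rank-one matrix x_d x_d^T."""
    A = np.stack([np.outer(X[:, d], X[:, d]).ravel()
                  for d in range(X.shape[1])], axis=1)
    return nnls_pg(A, S.ravel())
```

Because the constrained problem is convex, the projection step can only clip weights at zero; it never produces the negative weights the abstract warns about.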

6.
Bayesian classification is currently of considerable interest. It provides a strategy for eliminating the uncertainty associated with a particular choice of classifier model parameters, and is the optimal decision-theoretic choice under certain circumstances when there is no single “true” classifier for a given data set. Modern computing capabilities can easily support the Markov chain Monte Carlo sampling that is necessary to carry out the calculations involved, but the information available in these samples is not at present being fully utilised. We show how it can be allied to known results concerning the “reject option” in order to produce an assessment of the confidence that can be ascribed to particular classifications, and how these confidence measures can be used to compare the performances of classifiers. Incorporating these confidence measures can alter the apparent ranking of classifiers as given by straightforward success or error rates. Several possible methods for obtaining confidence assessments are described, and compared on a range of data sets using the Bayesian probabilistic nearest-neighbour classifier.
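The “reject option” invoked here has a simple operational core: abstain from classifying whenever the maximum posterior class probability (e.g. averaged over MCMC samples of the classifier parameters) falls below a confidence threshold. A hedged sketch with illustrative names:

```python
import numpy as np

def classify_with_reject(posteriors, threshold=0.8):
    """Return predicted class labels, with -1 for rejected cases.

    posteriors: (n_samples, n_classes) class-membership probabilities.
    Cases whose highest posterior is below `threshold` are rejected,
    i.e. the classifier abstains when its confidence is low."""
    conf = posteriors.max(axis=1)
    pred = posteriors.argmax(axis=1)
    pred[conf < threshold] = -1   # the reject option
    return pred
```

Comparing classifiers on the accepted cases only, at matched rejection rates, is one way such confidence measures can reorder an apparent ranking based on raw error rates.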

7.
Ultrametric tree representations of incomplete dissimilarity data (total citations: 2; self-citations: 2; other citations: 0)
The least squares algorithm for fitting ultrametric trees to proximity data originally proposed by Carroll and Pruzansky and further elaborated by De Soete is extended to handle missing data. A Monte Carlo evaluation reveals that the algorithm is capable of recovering an ultrametric tree underlying an incomplete set of error-perturbed dissimilarities quite well. Geert De Soete is Aangesteld Navorser of the Belgian Nationaal Fonds voor Wetenschappelijk Onderzoek.

8.
A clustering that consists of a nested set of clusters may be represented graphically by a tree. In contrast, a clustering that includes non-nested overlapping clusters (sometimes termed a “nonhierarchical” clustering) cannot be represented by a tree. Graphical representations of such non-nested overlapping clusterings are usually complex and difficult to interpret. Carroll and Pruzansky (1975, 1980) suggested representing non-nested clusterings with multiple ultrametric or additive trees. Corter and Tversky (1986) introduced the extended tree (EXTREE) model, which represents a non-nested structure as a tree plus overlapping clusters that are represented by marked segments in the tree. We show here that the problem of finding a nested (i.e., tree-structured) set of clusters in an overlapping clustering can be reformulated as the problem of finding a clique in a graph. Thus, clique-finding algorithms can be used to identify sets of clusters in the solution that can be represented by trees. This formulation provides a means of automatically constructing a multiple tree or extended tree representation of any non-nested clustering. The method, called “clustrees”, is applied to several non-nested overlapping clusterings derived using the MAPCLUS program (Arabie and Carroll 1980).
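The compatibility criterion underlying the clique reformulation is easy to state: two clusters can appear in the same tree iff they are disjoint or nested. The sketch below builds the compatibility graph and enumerates its maximal cliques by brute force; a real implementation would use a proper maximal-clique algorithm, and all names here are ours.

```python
from itertools import combinations

def compatible(a, b):
    """Two clusters fit in one tree iff they are disjoint or nested."""
    a, b = set(a), set(b)
    return not (a & b) or a <= b or b <= a

def tree_representable_subsets(clusters):
    """All maximal pairwise-compatible subfamilies of `clusters`, i.e.
    the maximal cliques of the compatibility graph. Brute force, so
    suitable only for small cluster sets."""
    n = len(clusters)
    cliques = []
    for r in range(n, 0, -1):              # largest candidate sets first
        for idx in combinations(range(n), r):
            ok = all(compatible(clusters[i], clusters[j])
                     for i, j in combinations(idx, 2))
            # keep only cliques not contained in one already found
            if ok and not any(set(idx) <= c for c in cliques):
                cliques.append(set(idx))
    return cliques
```

Each returned index set is a subfamily of clusters representable by a single ultrametric or additive tree, which is exactly the building block a multiple-tree or extended-tree representation needs.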

9.
The issue of determining “the right number of clusters” in K-Means has attracted considerable interest, especially in recent years. Cluster intermix appears to be the factor most affecting the clustering results. This paper proposes an experimental setting for comparing different approaches on data generated from Gaussian clusters, with controlled parameters of between- and within-cluster spread to model cluster intermix. The setting allows for evaluating the centroid recovery on par with conventional evaluation of the cluster recovery. The subjects of our interest are two versions of the “intelligent” K-Means method, iK-Means, that find the “right” number of clusters by extracting “anomalous patterns” from the data one by one. We compare them with seven other methods, including Hartigan’s rule, averaged Silhouette width and the Gap statistic, under different between- and within-cluster spread-shape conditions. There are several consistent patterns in the results of our experiments, such as that the right K is best reproduced by Hartigan’s rule, though not the clusters or their centroids. This leads us to propose an adjusted version of iK-Means, which performs well in the current experimental setting.
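Hartigan’s rule, one of the seven comparison methods named above, is simple enough to sketch: from the within-cluster sums of squares W_K, compute H(K) = (W_K / W_{K+1} − 1)(n − K − 1) and take the smallest K at which H drops below a conventional threshold of 10. The function below is an illustrative rendering, not the paper’s code.

```python
def hartigan_k(wcss, n, threshold=10.0):
    """Smallest K whose Hartigan statistic H(K) falls below `threshold`.

    wcss[i] is the within-cluster sum of squares for K = i + 1 clusters,
    n is the number of data points; threshold = 10 is the conventional cut."""
    for i in range(len(wcss) - 1):
        K = i + 1
        H = (wcss[i] / wcss[i + 1] - 1.0) * (n - K - 1)
        if H <= threshold:
            return K
    return len(wcss)   # no K qualified; fall back to the largest K tried
```

Note that the rule only inspects the W_K sequence, which is consistent with the paper's finding that it can pick the right K while the recovered clusters or centroids are still poor.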

10.
After sketching the historical development of “emergence” and noting several recent problems relating to “emergent properties”, this essay proposes that properties may be either “emergent” or “mergent” and either “intrinsic” or “extrinsic”. These two distinctions define four basic types of change: stagnation, permanence, flux, and evolution. To illustrate how emergence can operate in a purely logical system, the Geometry of Logic is introduced. This new method of analyzing conceptual systems involves the mapping of logical relations onto geometrical figures, following either an analytic or a synthetic pattern (or both together). Evolution is portrayed as a form of discontinuous change characterized by emergent properties that take on an intrinsic quality with respect to the object(s) or proposition(s) involved. Causal leaps, not continuous development, characterize the evolution of human life in a developing foetus, of a thought out of certain brain states, of a new idea (or insight) out of ordinary thoughts, and of a great person out of a set of historical experiences. The tendency to assume that understanding evolutionary change requires a step-by-step explanation of the historical development that led to the appearance of a certain emergent property is thereby discredited.

11.
The Kohonen self-organizing map method: An assessment (total citations: 1; self-citations: 0; other citations: 1)
The “self-organizing map” method, due to Kohonen, is a well-known neural network method. It is closely related to cluster analysis (partitioning) and other methods of data analysis. In this article, we explore some of these close relationships. A number of properties of the technique are discussed. Comparisons with various methods of data analysis (principal components analysis, k-means clustering, and others) are presented. This work was partially supported, for M. Hernández-Pajares, by the DGCICIT of Spain under grant No. PB90-0478 and by a CESCA-1993 computer-time grant. Fionn Murtagh is affiliated with the Astrophysics Division, Space Science Department, European Space Agency.

12.
Taxonomy Based Modeling was applied to describe drivers’ mental models of variable message signs (VMS’s) displayed on expressways. Progress in road telematics has made it possible to introduce such signs. Sensors embedded in the carriageway every 500 m record certain variables (speed, flow rate, etc.) that are transformed in real time into “driving times” to a given destination if road conditions do not change. VMS systems are auto-regulative Man-Machine (AMMI) systems which incorporate a model of the user: if the traffic flow is too high, then drivers should choose alternative routes, and in so doing the traffic flow should decrease. The model of the user is based on suppositions such as: people do not like to waste time, they fully understand the displayed messages, they trust the displayed values, and they know of alternative routes. However, people also have a model of the way the system functions, and if they do not believe the contents of the message, they will not act as expected. We collected data through interviews with drivers using the critical incidents technique (Flanagan, 1985). Results show that the mental models drivers have of the way the VMS system works are varied but not numerous, and that most of them differ from the “ideal expert” mental model. It is clear that users do not have an adequate model of how the VMS system works and that VMS planners have a model of user behaviour that does not correspond to the behaviour of the drivers we interviewed. Finally, Taxonomy Based Modeling is discussed as a tool for mental model remediation.

13.
The theory of the tight span, a cell complex that can be associated to every metric D, offers a unifying view on existing approaches for analyzing distance data, in particular for decomposing a metric D into a sum of simpler metrics as well as for representing it by certain specific edge-weighted graphs, often referred to as realizations of D. Many of these approaches involve the explicit or implicit computation of the so-called cutpoints of (the tight span of) D, such as the algorithm for computing the “building blocks” of optimal realizations of D recently presented by A. Hertz and S. Varone. The main result of this paper is an algorithm for computing the set of these cutpoints for a metric D on a finite set with n elements in O(n³) time. As a direct consequence, this improves the run time of the aforementioned O(n⁶) algorithm by Hertz and Varone by “three orders of magnitude”.

14.
This short note develops some ideas along the lines of the stimulating paper by Heylighen (Found Sci 15(4):345–356, 2010a). It summarizes a theme in several writings with Francis Bailly, downloadable from this author’s web page. The “geometrization” of time and causality is the common ground of the analysis hinted at here and in Heylighen’s paper. Heylighen adds a logical notion, consistency, in order to understand a possible origin of the selective process that may have originated this organization of natural phenomena. We will join our perspectives by pointing to some gnoseological complexes, common to mathematics and physics, which may shed light on the issues raised by Heylighen.

15.
A common practice in cross validation research in the behavioral sciences is to employ either the product moment correlation or a simple tabulation of first-choice “hits” for measuring the accuracy with which various preference models predict subjects’ responses to a holdout sample of choice objects. We propose a nonparametric approach for summarizing the accuracy of predicted rankings across a set of holdout-sample options. The methods that we develop contain a novel way to deal with ties and an approach to the different weighting of rank positions.
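One simple instance of the kind of summary statistic described, though not necessarily the authors’ exact proposal, is a position-weighted version of Spearman’s footrule with average ranks for ties: each absolute rank difference is divided by the true rank, so disagreements near the top of the holdout ranking count more. All names below are ours.

```python
def average_ranks(scores):
    """Ranks (1 = best = highest score); tied items get the average
    of the positions they occupy."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of tied positions, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def weighted_footrule(pred_scores, true_scores):
    """Sum of |rank difference| / true rank: errors at the top of the
    true ranking are weighted more heavily than errors at the bottom."""
    rp = average_ranks(pred_scores)
    rt = average_ranks(true_scores)
    return sum(abs(p - t) / t for p, t in zip(rp, rt))
```

A score of 0 means the predicted ranking reproduces the holdout ranking exactly; larger values indicate disagreement concentrated near the top.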

16.
Towards a Hierarchical Definition of Life, the Organism, and Death (total citations: 3; self-citations: 3; other citations: 0)
Despite hundreds of definitions, no consensus exists on a definition of life or on the closely related and problematic definitions of the organism and death. These problems retard practical and theoretical development in, for example, exobiology, artificial life, biology and evolution. This paper suggests improving this situation by basing definitions on a theory of a generalized particle hierarchy. This theory uses the common denominator of the “operator” for a unified ranking of both particles and organisms, from elementary particles to animals with brains. Accordingly, this ranking is called “the operator hierarchy”. This hierarchy allows life to be defined as: matter with the configuration of an operator, and that possesses a complexity equal to or higher than that of the cellular operator. Living is then synonymous with the dynamics of such operators, and the word organism refers to a select group of operators that fit the definition of life. The minimum condition defining an organism is its existence as an operator, construction thus being more essential than metabolism, growth or reproduction. In the operator hierarchy, every organism is associated with a specific closure, for example, the nucleus in eukaryotes. This allows death to be defined as: the state in which an organism has lost its closure following irreversible deterioration of its organization. The generality of the operator hierarchy also offers a context to discuss “life as we do not know it”. The paper ends by testing the definition’s practical value on a range of examples.

17.
The aim of this contribution is to critically examine the metaphysical presuppositions that prevail in Stewart’s (Found Sci 15(4):395–409, 2010a) answer to the question “are we in the midst of a developmental process?”, as expressed in his statement “that humanity has discovered the trajectory of past evolution and can see how it is likely to continue in the future”.

18.
We consider two fundamental properties in the analysis of two-way tables of positive data: the principle of distributional equivalence, one of the cornerstones of correspondence analysis of contingency tables, and the principle of subcompositional coherence, which forms the basis of compositional data analysis. For an analysis to be subcompositionally coherent, it suffices to analyze the ratios of the data values. A common approach to dimension reduction in compositional data analysis is to perform principal component analysis on the logarithms of ratios, but this method does not obey the principle of distributional equivalence. We show that by introducing weights for the rows and columns, the method achieves this desirable property and can be applied to a wider class of methods. This weighted log-ratio analysis is theoretically equivalent to “spectral mapping”, a multivariate method developed almost 30 years ago for displaying ratio-scale data from biological activity spectra. The close relationship between spectral mapping and correspondence analysis is also explained, as well as their connection with association modeling. The weighted log-ratio methodology is used here to visualize frequency data in linguistics and chemical compositional data in archeology. The first author acknowledges research support from the Fundación BBVA in Madrid as well as partial support by the Spanish Ministry of Education and Science, grant MEC-SEJ2006-14098. The constructive comments of the referees, who also brought additional relevant literature to our attention, significantly improved our article.
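The computational core of weighted log-ratio analysis can be sketched as a weighted double-centering of the log-transformed table followed by a singular value decomposition. In this sketch the row and column weights are taken to be the table margins, and the coordinate scalings are one common convention among several; these choices and all names are ours, not necessarily the paper's.

```python
import numpy as np

def weighted_logratio(N, ndim=2):
    """Weighted log-ratio analysis (spectral-mapping style), sketched.

    N: strictly positive data matrix. Row and column masses are taken
    to be the margins of N -- an assumption of this sketch."""
    P = N / N.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)            # row and column masses
    L = np.log(N)
    L = L - np.outer(L @ c, np.ones_like(c))       # weighted row centering
    L = L - np.outer(np.ones_like(r), r @ L)       # weighted column centering
    S = np.sqrt(r)[:, None] * L * np.sqrt(c)[None, :]
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    rows = (U[:, :ndim] * s[:ndim]) / np.sqrt(r)[:, None]  # principal coords
    cols = Vt[:ndim].T / np.sqrt(c)[:, None]               # standard coords
    return rows, cols, s[:ndim]
```

Because only log ratios survive the double-centering, the analysis depends on the data solely through ratios of values, which is what makes it subcompositionally coherent.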

19.
A mathematical programming algorithm is developed for fitting ultrametric or additive trees to proximity data where external constraints are imposed on the topology of the tree. The two procedures minimize a least squares loss function. The method is illustrated on both synthetic and real data. A constrained ultrametric tree analysis was performed on similarities between 32 subjects based on preferences for ten odors, while a constrained additive tree analysis was carried out on some proximity data between kinship terms. Finally, some extensions of the methodology to other tree fitting procedures are mentioned. The first author is supported as Bevoegdverklaard Navorser of the Belgian Nationaal Fonds voor Wetenschappelijk Onderzoek.

20.
In multivariate discrimination of several normal populations, the optimal classification procedure is based on quadratic discriminant functions. We compare expected error rates of the quadratic classification procedure if the covariance matrices are estimated under the following four models: (i) arbitrary covariance matrices, (ii) common principal components, (iii) proportional covariance matrices, and (iv) identical covariance matrices. Using Monte Carlo simulation to estimate expected error rates, we study the performance of the four discrimination procedures for five different parameter setups corresponding to standard situations that have been used in the literature. The procedures are examined for sample sizes ranging from 10 to 60, and for two to four groups. Our results quantify the extent to which a parsimonious method reduces error rates, and demonstrate that choosing a simple method of discrimination is often beneficial even if the underlying model assumptions are wrong. The authors wish to thank the editor and three referees for their helpful comments on the first draft of this article. M. J. Schmid was supported by grants no. 2.724-0.85 and 2.038-0.86 of the Swiss National Science Foundation.
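A minimal Monte Carlo comparison of the two extreme models, (i) arbitrary covariance matrices (the quadratic rule) and (iv) identical covariance matrices (a pooled, linear rule), can be sketched as follows. This is an illustrative setup with names of our choosing, not the paper's five parameter configurations.

```python
import numpy as np

def qscore(x, mu, Sigma, prior):
    """Quadratic discriminant score: log prior + log normal density
    (up to an additive constant shared by all groups)."""
    d = x - mu
    _, logdet = np.linalg.slogdet(Sigma)
    return np.log(prior) - 0.5 * (logdet + d @ np.linalg.solve(Sigma, d))

def error_rate(train, test, pooled=False):
    """Misclassification rate of the normal-theory rule.

    train/test: lists of (n_g, p) arrays, one per group.
    pooled=False: model (i), a separate covariance per group (quadratic rule).
    pooled=True:  model (iv), one common covariance (linear rule)."""
    mus = [X.mean(axis=0) for X in train]
    covs = [np.cov(X.T) for X in train]
    if pooled:
        n = sum(len(X) for X in train)
        S = sum((len(X) - 1) * C for X, C in zip(train, covs)) / (n - len(train))
        covs = [S] * len(train)
    prior = 1.0 / len(train)
    wrong = total = 0
    for g, X in enumerate(test):
        for x in X:
            scores = [qscore(x, m, C, prior) for m, C in zip(mus, covs)]
            wrong += int(int(np.argmax(scores)) != g)
            total += 1
    return wrong / total
```

When the true covariances really are equal, the pooled estimate uses the data more efficiently, which is the intuition behind the paper's finding that parsimonious models can win even under mild model misspecification.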


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号