Similar literature
20 similar documents found (search time: 31 ms)
1.
We present a method and an algorithm that place interval and ordinal multidimensional scaling at the two ends of a continuum. Theory and simulation show that the method compares favorably with classical scaling methods. A parameter is identified that produces a scaling combining the benefits of interval and ordinal scaling.

2.
This paper develops a new procedure for simultaneously performing multidimensional scaling and cluster analysis on two-way compositional data of proportions. The objective of the proposed procedure is to delineate patterns of variability in compositions across subjects by simultaneously clustering subjects into latent classes or groups and estimating a joint space of stimulus coordinates and class-specific vectors in a multidimensional space. We use a conditional mixture, maximum likelihood framework with an E-M algorithm for parameter estimation. The proposed procedure is illustrated using a compositional data set reflecting proportions of viewing time across television networks for an area sample of households.

3.
An asymmetric multidimensional scaling model and an associated nonmetric algorithm to analyze two-mode three-way proximities (object × object × source) are introduced. The model consists of a common object configuration and two kinds of weights, i.e., for both symmetry and asymmetry. In the common object configuration, each object is represented by a point and a circle (sphere, hypersphere) in a Euclidean space. The common object configuration represents pairwise proximity relationships between pairs of objects for the ‘group’ of all sources. Each source has its own symmetry weight and a set of asymmetry weights. Symmetry weights represent individual differences among sources of data in symmetric proximity relationships, and asymmetry weights represent individual differences among sources in asymmetric proximity relationships. The associated nonmetric algorithm, based on Kruskal’s (1964b) nonmetric multidimensional scaling algorithm, is an extension of the algorithm for the asymmetric multidimensional scaling of one-mode two-way proximities developed earlier (Okada and Imaizumi 1987). As an illustrative example, we analyze intergenerational occupational mobility from 1955 to 1985 in Japan among eight occupational categories.

4.
A validation study of a variable weighting algorithm for cluster analysis   Cited by: 1 (self-citations: 0, citations by others: 1)
De Soete (1986, 1988) proposed a variable weighting procedure for use when Euclidean distance is the dissimilarity measure with an ultrametric hierarchical clustering method. The algorithm produces weighted distances which approximate ultrametric distances as closely as possible in a least squares sense. The present simulation study examined the effectiveness of the De Soete procedure for an applications problem for which it was not originally intended: determining whether the algorithm can be used to reduce the influence of variables that are irrelevant to the clustering present in the data. The simulation study examined the ability of the procedure to recover a variety of known underlying cluster structures. The results indicate that the algorithm is effective in identifying extraneous variables which do not contribute information about the true cluster structure. Weights near 0.0 were typically assigned to such extraneous variables. Furthermore, the variable weighting procedure was not adversely affected by the presence of other forms of error in the data. In general, it is recommended that the variable weighting procedure be used for applied analyses when Euclidean distance is employed with ultrametric hierarchical clustering methods.
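The effect of a near-zero weight can be illustrated with a minimal sketch. It assumes the common weighted Euclidean form, the square root of the weighted sum of squared differences; the function name and the data are hypothetical, and the weights are chosen by hand rather than estimated by the De Soete least-squares procedure.

```python
import math

def weighted_euclidean(x, y, weights):
    """Weighted Euclidean distance: sqrt(sum_k w_k * (x_k - y_k)^2)."""
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y, and weights must have the same length")
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))

# Two objects that agree on the first two variables but differ on a third,
# extraneous variable (made-up numbers).
a = [1.0, 2.0, 9.0]
b = [1.0, 2.0, -4.0]

unweighted = weighted_euclidean(a, b, [1.0, 1.0, 1.0])    # extraneous variable dominates
downweighted = weighted_euclidean(a, b, [1.0, 1.0, 0.0])  # extraneous variable ignored
```

With equal weights the two objects look far apart solely because of the third variable; a weight of 0.0 on that variable makes them coincide.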

5.
k. In this procedure, a least-squares loss function in terms of discrepancies between D and M is minimized. The present paper describes the original hierarchical classes algorithm proposed by De Boeck and Rosenberg (1988), which is based on an alternating greedy heuristic, and proposes a new algorithm, based on an alternating branch-and-bound procedure. An extensive simulation study is reported in which both algorithms are evaluated and compared according to goodness-of-fit to the data and goodness-of-recovery of the underlying true structure. Furthermore, three heuristics for selecting models of different ranks for a given D are presented and compared. The simulation results show that the new algorithm yields models with slightly higher goodness-of-fit and goodness-of-recovery values.

6.
The DINA model is a commonly used model for obtaining diagnostic information. Like many other Diagnostic Classification Models (DCMs), it can require a large sample size to obtain reliable item and examinee parameter estimates. Neural Network (NN) analysis is a classification method that uses a training dataset for calibration. As a result, if this training dataset is determined theoretically, as was the case in Gierl’s attribute hierarchical method (AHM), the NN analysis does not have any sample size requirements. However, an NN approach does not provide traditional item parameters of a DCM or allow for item responses to influence test calibration. In this paper, the NN approach is implemented for DINA model estimation to explore its effectiveness as a classification method beyond its use in AHM. The accuracy of the NN approach across different sample sizes, item quality and Q-matrix complexity is described in the DINA model context. Then, a Markov Chain Monte Carlo (MCMC) estimation algorithm and Joint Maximum Likelihood Estimation (JMLE) are used to extend the NN approach so that item parameters associated with the DINA model are obtained while allowing examinee responses to influence the test calibration. The results derived by the NN, the combination of MCMC and NN (NN MCMC) and the combination of JMLE and NN are compared with those of the well-established Hierarchical MCMC procedure and JMLE with a uniform prior on the attribute profile to illustrate their strengths and weaknesses.
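For orientation, the DINA item response function can be sketched in a few lines. This assumes the standard formulation, in which an examinee who has mastered every attribute the item requires answers correctly with probability 1 − slip, and any other examinee with probability guess. The function and parameter names are hypothetical, and none of the estimation procedures from the abstract (NN, MCMC, JMLE) is shown.

```python
def dina_correct_prob(alpha, q_row, slip, guess):
    """P(correct response) for one examinee on one item under the DINA model.

    alpha : list of 0/1 attribute-mastery indicators for the examinee
    q_row : list of 0/1 entries of the item's row in the Q-matrix
    slip  : probability that an examinee with all required attributes errs
    guess : probability that an examinee lacking a required attribute succeeds
    """
    if len(alpha) != len(q_row):
        raise ValueError("alpha and q_row must have the same length")
    # eta is True only if every attribute the item requires has been mastered.
    eta = all(a == 1 for a, q in zip(alpha, q_row) if q == 1)
    return 1.0 - slip if eta else guess

# The item requires attributes 1 and 2 (illustrative values).
master = dina_correct_prob([1, 1, 0], [1, 1, 0], slip=0.1, guess=0.2)
non_master = dina_correct_prob([1, 0, 0], [1, 1, 0], slip=0.1, guess=0.2)
```

Lacking even one required attribute drops the examinee to the guessing probability, which is the conjunctive behavior that gives the model its name.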

7.
It is well known that considering a non-Euclidean Minkowski metric in Multidimensional Scaling, either for the distance model or for the loss function, increases the computational problem of local minima considerably. In this paper, we propose an algorithm in which both the loss function and the composition rule can be considered in any Minkowski metric, using a multivariate randomly alternating Simulated Annealing procedure with permutation and translation phases. The algorithm has been implemented in Fortran and tested over classical and simulated data matrices with sizes up to 200 objects. A study has been carried out with some of the common loss functions to determine the most suitable values for the main parameters. The experimental results confirm the theoretical expectation that Simulated Annealing is a suitable strategy to deal by itself with the optimization problems in Multidimensional Scaling, in particular for City-Block, Euclidean and Infinity metrics.
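The three metrics named at the end of the abstract are special cases of one Minkowski family. The following is a minimal sketch of that distance only; the function name is hypothetical, and the simulated annealing procedure itself is not reproduced here.

```python
import math

def minkowski_distance(x, y, p):
    """Minkowski distance between two points; p = math.inf gives the max-norm."""
    if len(x) != len(y):
        raise ValueError("points must have the same dimension")
    if p < 1:
        raise ValueError("p must be >= 1 for a proper metric")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

u, v = (0.0, 0.0), (3.0, 4.0)
city_block = minkowski_distance(u, v, 1)       # sum of coordinate differences
euclidean = minkowski_distance(u, v, 2)        # straight-line distance
infinity = minkowski_distance(u, v, math.inf)  # largest coordinate difference
```

For the points (0, 0) and (3, 4) the distance shrinks from 7 to 5 to 4 as p goes from 1 to 2 to infinity.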

8.
Carroll and Chang have derived the symmetric CANDECOMP model from the INDSCAL model, to fit symmetric matrices of approximate scalar products in the least squares sense. Typically, the CANDECOMP algorithm is used to estimate the parameters. In the present paper it is shown that negative weights may occur with CANDECOMP. This phenomenon can be suppressed by updating the weights by the Nonnegative Least Squares Algorithm. A potential drawback of the resulting procedure is that it may produce two different versions of the stimulus space matrix. To obviate this possibility, a symmetry-preserving algorithm is offered, which can be monitored to produce non-negative weights as well. This work was partially supported by the Royal Netherlands Academy of Arts and Sciences.

9.
In this research note, I present a modified version of G. De Soete, L. Hubert, and P. Arabie’s (1988) simulated annealing approach for the problem of L2 unidimensional scaling via maximization of the Defays criterion. The modifications include efficient storage and computation methods that facilitate rapid evaluation of trial solutions. The results of two experimental studies indicate that the enhanced simulated annealing algorithm is competitive with A. Murillo, J.F. Vera, and W.J. Heiser’s (2005) recently published pertsaus2 procedure in terms of solution quality and computation time. Both Fortran and MATLAB versions of this modified simulated annealing implementation are available from the author.
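Evaluating one trial solution can be sketched as follows, assuming the usual statement of the Defays criterion: for a given object order, each object receives the sum of its dissimilarities to objects placed before it minus the sum to objects placed after it; the criterion is the sum of squares of these values, and dividing each value by n gives the least-squares coordinate. The function name and data are hypothetical, and the efficient storage and update schemes described in the abstract are not reproduced.

```python
def defays_criterion(d, order):
    """Return (criterion, coordinates) for objects placed in the given order.

    d     : symmetric dissimilarity matrix as a list of lists
    order : permutation of range(n); order[k] is the object in position k
    """
    n = len(d)
    position = {obj: k for k, obj in enumerate(order)}
    coords = [0.0] * n
    criterion = 0.0
    for i in range(n):
        # Dissimilarities to objects placed before i, minus those placed after.
        t = sum(d[i][j] if position[j] < position[i] else -d[i][j]
                for j in range(n) if j != i)
        criterion += t * t
        coords[i] = t / n  # least-squares coordinate for this order
    return criterion, coords

# Distances between points located at 0, 1 and 3 on a line (made-up data).
d = [[0, 1, 3],
     [1, 0, 2],
     [3, 2, 0]]
best, coords = defays_criterion(d, [0, 1, 2])  # the true order
worse, _ = defays_criterion(d, [1, 0, 2])      # first two objects swapped
```

On this error-free example the true order gives the larger criterion value, and the recovered coordinates reproduce the original spacing up to a shift of origin.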

10.
The aim of this paper is to analyze two scaling extensions of the Orthogonal Procrustes Problem (OPP) called the pre-scaling and the post-scaling approaches. We also discuss some problems related to these extensions and propose two new algorithms to find optimal solutions. These algorithms, which are based on the majorization principle, are shown to be monotonically convergent and their performance is examined.

11.
Consider N entities to be classified (e.g., geographical areas), a matrix of dissimilarities between pairs of entities, and a graph H with vertices associated with these entities such that the edges join the vertices corresponding to contiguous entities. The split of a cluster is the smallest dissimilarity between an entity of this cluster and an entity outside of it. The single-linkage algorithm (ignoring contiguity between entities) provides partitions into M clusters for which the smallest split of the clusters, called the split of the partition, is maximum. We study here the partitioning of the set of entities into M connected clusters for all M between N − 1 and 2 (i.e., clusters such that the subgraphs of H induced by their corresponding sets of entities are connected) with maximum split subject to that condition. We first provide an exact algorithm with a complexity of order N² for the particular case in which H is a tree. This algorithm suggests in turn a first heuristic algorithm for the general problem. Several variants of this heuristic are also explored. We then present an exact algorithm for the general case based on iterative determination of cocycles of subtrees and on the solution of auxiliary set covering problems. As solution of the latter problems is time-consuming for large instances, we provide another heuristic in which the auxiliary set covering problems are solved approximately. Computational results obtained with the exact and heuristic algorithms are presented on test problems from the literature.
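The two definitions used above, the split of a cluster and the split of a partition, translate directly into code. This is a minimal sketch with hypothetical function names and made-up data; it ignores the contiguity graph H and implements neither the exact nor the heuristic algorithms of the paper.

```python
def cluster_split(d, cluster):
    """Smallest dissimilarity between a member of `cluster` and a non-member."""
    outside = [j for j in range(len(d)) if j not in cluster]
    return min(d[i][j] for i in cluster for j in outside)

def partition_split(d, partition):
    """Split of a partition: the smallest split among its clusters."""
    return min(cluster_split(d, set(c)) for c in partition)

# Four entities located at 0, 1, 5 and 6 on a line (made-up data).
d = [[0, 1, 5, 6],
     [1, 0, 4, 5],
     [5, 4, 0, 1],
     [6, 5, 1, 0]]
well_separated = partition_split(d, [[0, 1], [2, 3]])
poorly_separated = partition_split(d, [[0], [1, 2, 3]])
```

Grouping the two nearby pairs yields a split of 4, whereas isolating a single entity from its close neighbor yields a split of 1, so the first partition is preferred under the maximum-split criterion.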

12.
The SINDCLUS algorithm for fitting the ADCLUS and INDCLUS models deals with a parameter matrix that occurs twice in the model by considering the two occurrences as independent parameter matrices. This procedure has been justified empirically by the observation that upon convergence of the algorithm to the global optimum, the two independently treated parameter matrices turn out to be equal. In the present paper, results are presented that contradict this finding, and a modification of SINDCLUS is presented which obviates the need for independently treating two occurrences of the same parameter matrix.

13.
This study aims to understand scientific inference for the evolutionary procedure of Continental Drift based on abductive inference, which is important for creative inference and scientific discovery during problem solving. We present the following two research problems: (1) we suggest a scientific inference procedure as well as various strategies and a criterion for choosing hypotheses over other competing or previous hypotheses; aspects of this procedure include puzzling observation, abduction, retroduction, updating, deduction, induction, and recycle; and (2) we analyze the “theory of continental drift” discovery, called the Earth science revolution, using our multistage inference procedure. Wegener’s Continental Drift hypothesis had an impact comparable to the revolution caused by Darwin’s theory of evolution in biology. Finally, the suggested inquiry inference model can provide us with a more consistent view of science and promote a deeper understanding of scientific concepts.

14.
We construct a weighted Euclidean distance that approximates any distance or dissimilarity measure between individuals that is based on a rectangular cases-by-variables data matrix. In contrast to regular multidimensional scaling methods for dissimilarity data, our approach leads to biplots of individuals and variables while preserving all the good properties of dimension-reduction methods that are based on the singular-value decomposition. The main benefits are the decomposition of variance into components along principal axes, which provide the numerical diagnostics known as contributions, and the estimation of nonnegative weights for each variable. The idea is inspired by the distance functions used in correspondence analysis and in principal component analysis of standardized data, where the normalizations inherent in the distances can be considered as differential weighting of the variables. In weighted Euclidean biplots, we allow these weights to be unknown parameters, which are estimated from the data to maximize the fit to the chosen distances or dissimilarities. These weights are estimated using a majorization algorithm. Once this extra weight-estimation step is accomplished, the procedure follows the classical path in decomposing the matrix and displaying its rows and columns in biplots.

15.
Incremental Classification with Generalized Eigenvalues   Cited by: 2 (self-citations: 0, citations by others: 2)
Supervised learning techniques are widely accepted methods to analyze data for scientific and real world problems. Most of these problems require fast and continuous acquisition of data, which are to be used in training the learning system. Therefore, keeping such systems up to date may become cumbersome. Various techniques have been devised in the field of machine learning to solve this problem. In this study, we propose an algorithm to reduce the training data to a substantially smaller subset of the original training data to train a generalized eigenvalue classifier. The proposed method provides a constructive way to understand the influence of new training data on an existing classification function. We show through numerical experiments that this technique prevents the overfitting problem of the earlier generalized eigenvalue classifiers, while promising a comparable performance in classification with respect to the state-of-the-art classification methods.

16.
Multidimensional scaling in the city-block metric: A combinatorial approach   Cited by: 1 (self-citations: 1, citations by others: 0)
We present an approach, independent of the common gradient-based necessary conditions for obtaining a (locally) optimal solution, to multidimensional scaling using the city-block distance function, and implementable in either a metric or nonmetric context. The difficulties encountered in relying on a gradient-based strategy are first reviewed: the general weakness in indicating a good solution that is implied by the satisfaction of the necessary condition of a zero gradient, and the possibility of actual nonconvergence of the associated optimization strategy. To avoid the dependence on gradients for guiding the optimization technique, an alternative iterative procedure is proposed that incorporates (a) combinatorial optimization to construct good object orders along the chosen number of dimensions and (b) nonnegative least-squares to re-estimate the coordinates for the objects based on the object orders. The re-estimated coordinates are used to improve upon the given object orders, which may in turn lead to better coordinates, and so on until convergence of the entire process occurs to a (locally) optimal solution. The approach is illustrated through several data sets on the perception of similarity of rectangles and compared to the results obtained with a gradient-based method.

17.
L1-norm are also presented. I conclude that the computational scaling problem depends largely on the criterion of interest, with unidimensional scaling in the L1-norm being especially challenging.

18.
The majorization method for multidimensional scaling with Kruskal's STRESS has been limited to Euclidean distances only. Here we extend the majorization algorithm to deal with Minkowski distances with 1 ≤ p ≤ 2 and suggest an algorithm that is partially based on majorization for p outside this range. We give some convergence proofs and extend the zero distance theorem of De Leeuw (1984) to Minkowski distances with p > 1.
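For reference, the quantities involved can be written out as follows. The notation is an assumption, since the abstract defines no symbols: X is an n × m configuration with entries x_{is}, δ_{ij} are the given dissimilarities, and w_{ij} are nonnegative weights. The loss shown is the raw (unnormalized) form of STRESS.

```latex
% Minkowski distance between points i and j of the configuration X
d_{ij}(X) = \Bigl( \sum_{s=1}^{m} \lvert x_{is} - x_{js} \rvert^{p} \Bigr)^{1/p}, \qquad p \ge 1

% Raw STRESS for dissimilarities \delta_{ij} with nonnegative weights w_{ij}
\sigma(X) = \sum_{i<j} w_{ij} \bigl( \delta_{ij} - d_{ij}(X) \bigr)^{2}
```

Setting p = 2 recovers the Euclidean case to which the majorization method was previously limited, and p = 1 gives the city-block distance.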

19.
Ever wonder if it is possible to construct a numeric scale for environmental variables, as one does for temperature? This paper is an attempt to construct one. There are two main parts: section “Statistical Analysis of Variations” presents a general statistical strategy for environmental factor selection. Section “Nonlinear Analytical Geometric Model of Variations” develops an analytical geometric representation of system variations in response to environmental changes. The model is used to quantify the effects of environmental interactions. The paper treats only the one-dimensional case; however, the derivation for the case of multiple independent factors follows immediately. The general method developed in this paper may prove applicable to many different fields, such as extensions beyond classical physics, economics, and other sciences. Section “Conclusion” provides an illustration of applications, examples and implications of the results.

20.
A procedure is presented which permits the analysis of factor analytic problems in which several groups exist. The analysis incorporates a hierarchical scheme of searching for factorial invariance and is an extension of Meredith's (1964) Method One procedure. By overlaying a contextual frame of reference on a traditional factor analysis solution, it is possible to use this technique to examine structural similarity and dissimilarity between groups. The procedure is exhibited in an example and in addition a comparison is made to discriminant analysis.

