Found 20 similar documents; search took 31 ms.
1.
Robert B. Schneider 《Journal of Classification》1992,9(2):257-273
We present a method and an algorithm that put interval and ordinal multidimensional scaling at the two ends of a continuum. Theory and simulation show that the method compares favorably with classical scaling methods. A parameter is identified that produces scaling combining the benefits of interval and ordinal scaling.
2.
This paper develops a new procedure for simultaneously performing multidimensional scaling and cluster analysis on two-way
compositional data of proportions. The objective of the proposed procedure is to delineate patterns of variability in compositions
across subjects by simultaneously clustering subjects into latent classes or groups and estimating a joint space of stimulus
coordinates and class-specific vectors in a multidimensional space. We use a conditional mixture, maximum likelihood framework
with an E-M algorithm for parameter estimation. The proposed procedure is illustrated using a compositional data set reflecting
proportions of viewing time across television networks for an area sample of households.
3.
An asymmetric multidimensional scaling model and an associated nonmetric algorithm to analyze two-mode three-way proximities (object × object × source) are introduced. The model consists of a common object configuration and two kinds of weights, i.e., for both symmetry and asymmetry. In the common object configuration, each object is represented by a point and a circle (sphere, hypersphere) in a Euclidean space. The common object configuration represents proximity relationships between pairs of objects for the ‘group’ of all sources. Each source has its own symmetry weight and a set of asymmetry weights. Symmetry weights represent individual differences among sources of data in symmetric proximity relationships, and asymmetry weights represent individual differences among sources in asymmetric proximity relationships. The associated nonmetric algorithm, based on Kruskal’s (1964b) nonmetric multidimensional scaling algorithm, is an extension of the algorithm for the asymmetric multidimensional scaling of one-mode two-way proximities developed earlier (Okada and Imaizumi 1987). As an illustrative example, we analyze intergenerational occupational mobility from 1955 to 1985 in Japan among eight occupational categories.
4.
Glenn W. Milligan 《Journal of Classification》1989,6(1):53-71
De Soete (1986, 1988) proposed a variable weighting procedure when Euclidean distance is used as the dissimilarity measure with an ultrametric hierarchical clustering method. The algorithm produces weighted distances which approximate ultrametric distances as closely as possible in a least squares sense. The present simulation study examined the effectiveness of the De Soete procedure for an applications problem for which it was not originally intended: determining whether or not the algorithm can be used to reduce the influence of variables which are irrelevant to the clustering present in the data. The simulation study examined the ability of the procedure to recover a variety of known underlying cluster structures. The results indicate that the algorithm is effective in identifying extraneous variables which do not contribute information about the true cluster structure. Weights near 0.0 were typically assigned to such extraneous variables. Furthermore, the variable weighting procedure was not adversely affected by the presence of other forms of error in the data. In general, it is recommended that the variable weighting procedure be used for applied analyses when Euclidean distance is employed with ultrametric hierarchical clustering methods.
5.
k . In this procedure, a least-squares loss function in terms of discrepancies between D and M is minimized. The present paper
describes the original hierarchical classes algorithm proposed by De Boeck and Rosenberg (1988), which is based on an alternating
greedy heuristic, and proposes a new algorithm, based on an alternating branch-and-bound procedure. An extensive simulation
study is reported in which both algorithms are evaluated and compared according to goodness-of-fit to the data and goodness-of-recovery
of the underlying true structure. Furthermore, three heuristics for selecting models of different ranks for a given D are
presented and compared. The simulation results show that the new algorithm yields models with slightly higher goodness-of-fit
and goodness-of-recovery values.
6.
The DINA model is a commonly used model for obtaining diagnostic information. Like many other Diagnostic Classification Models (DCMs), it can require a large sample size to obtain reliable item and examinee parameter estimates. Neural Network (NN) analysis is a classification method that uses a training dataset for calibration. As a result, if this training dataset is determined theoretically, as was the case in Gierl’s attribute hierarchical method (AHM), the NN analysis does not have any sample size requirements. However, a NN approach does not provide traditional item parameters of a DCM or allow item responses to influence test calibration. In this paper, the NN approach is implemented for DINA model estimation to explore its effectiveness as a classification method beyond its use in the AHM. The accuracy of the NN approach across different sample sizes, item qualities, and Q-matrix complexities is described in the DINA model context. Then, a Markov Chain Monte Carlo (MCMC) estimation algorithm and Joint Maximum Likelihood Estimation (JMLE) are used to extend the NN approach so that item parameters associated with the DINA model are obtained while allowing examinee responses to influence the test calibration. The results derived by the NN, the combination of MCMC and NN (NN MCMC), and the combination of JMLE and NN are compared with those of the well-established Hierarchical MCMC procedure and JMLE with a uniform prior on the attribute profile to illustrate their strengths and weaknesses.
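The DINA item response function itself is compact; a minimal sketch (the slip and guess values and the example Q-matrix row are illustrative assumptions, not estimates from the paper):

```python
import numpy as np

def dina_prob(alpha, q, slip, guess):
    # eta = 1 iff the examinee has mastered every attribute the item requires
    eta = bool(np.all(alpha >= q))
    return (1.0 - slip) if eta else guess

# an illustrative item requiring attributes 1 and 3 of a 3-attribute profile
q = np.array([1, 0, 1])
p_master = dina_prob(np.array([1, 1, 1]), q, slip=0.1, guess=0.2)
p_nonmaster = dina_prob(np.array([1, 1, 0]), q, slip=0.1, guess=0.2)
```

Examinees with all required attributes answer correctly with probability 1 − slip; everyone else answers correctly only with the guessing probability.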
7.
Global Optimization in Any Minkowski Metric: A Permutation-Translation Simulated Annealing Algorithm for Multidimensional Scaling
It is well known that considering a non-Euclidean Minkowski metric in Multidimensional Scaling, either for the distance model
or for the loss function, considerably aggravates the computational problem of local minima. In this paper, we propose an algorithm
in which both the loss function and the composition rule can be considered in any Minkowski metric, using a multivariate randomly
alternating Simulated Annealing procedure with permutation and translation phases. The algorithm has been implemented in Fortran
and tested over classical and simulated data matrices with sizes up to 200 objects. A study has been carried out with some
of the common loss functions to determine the most suitable values for the main parameters. The experimental results confirm
the theoretical expectation that Simulated Annealing is a suitable strategy to deal by itself with the optimization problems
in Multidimensional Scaling, in particular for City-Block, Euclidean and Infinity metrics.
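A toy sketch of the permutation-translation idea for Minkowski-metric stress (not the authors' Fortran implementation; the move sizes, cooling schedule, and function names are assumptions):

```python
import numpy as np

def minkowski_dist(X, p):
    # pairwise Minkowski distances between the rows of X
    diff = np.abs(X[:, None, :] - X[None, :, :])
    return (diff ** p).sum(axis=2) ** (1.0 / p)

def stress(X, delta, p):
    # raw stress: sum of squared discrepancies over object pairs
    iu = np.triu_indices(len(X), k=1)
    return float(np.sum((delta[iu] - minkowski_dist(X, p)[iu]) ** 2))

def anneal_mds(delta, ndim, p, n_iter=5000, t0=1.0, cooling=0.999, seed=0):
    rng = np.random.default_rng(seed)
    n = delta.shape[0]
    X = rng.normal(size=(n, ndim))
    cur, t = stress(X, delta, p), t0
    for _ in range(n_iter):
        Y = X.copy()
        if rng.random() < 0.5:
            Y[rng.integers(n)] += rng.normal(scale=0.1, size=ndim)  # translation phase
        else:
            i, j = rng.choice(n, size=2, replace=False)             # permutation phase
            Y[[i, j]] = Y[[j, i]]
        new = stress(Y, delta, p)
        if new < cur or rng.random() < np.exp((cur - new) / t):
            X, cur = Y, new
        t *= cooling
    return X, cur

# toy check: recover a 1-D configuration under the city-block (p = 1) metric
truth = np.array([[0.0], [1.0], [3.0]])
delta = minkowski_dist(truth, 1.0)
X, s = anneal_mds(delta, ndim=1, p=1.0)
```

Since both moves only require re-evaluating the loss, the same loop works for any Minkowski p in the distance model or the loss function.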
8.
Carroll and Chang have derived the symmetric CANDECOMP model from the INDSCAL model, to fit symmetric matrices of approximate
scalar products in the least squares sense. Typically, the CANDECOMP algorithm is used to estimate the parameters. In the
present paper it is shown that negative weights may occur with CANDECOMP. This phenomenon can be suppressed by updating the
weights by the Nonnegative Least Squares Algorithm. A potential drawback of the resulting procedure is that it may produce
two different versions of the stimulus space matrix. To obviate this possibility, a symmetry preserving algorithm is offered,
which can be monitored to produce non-negative weights as well.
This work was partially supported by the Royal Netherlands Academy of Arts and Sciences.
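The nonnegative weight update can be sketched with an off-the-shelf NNLS solver; a minimal illustration with simulated scalar-product matrices (the symmetry-preserving algorithm itself is not reproduced here, and all names are illustrative):

```python
import numpy as np
from scipy.optimize import nnls

def update_weights(S_list, B):
    # refit each source's weights by nonnegative least squares:
    # vec(S_k) ~ sum_r w_kr vec(b_r b_r'), subject to w_kr >= 0
    r = B.shape[1]
    A = np.column_stack([np.outer(B[:, j], B[:, j]).ravel() for j in range(r)])
    return np.array([nnls(A, S.ravel())[0] for S in S_list])

# toy data generated from the model with known nonnegative weights
rng = np.random.default_rng(1)
B = rng.normal(size=(5, 2))                   # stimulus space matrix
W_true = np.array([[2.0, 0.5], [0.0, 1.5]])   # one weight row per source
S_list = [(B * w) @ B.T for w in W_true]      # S_k = B diag(w_k) B'
W_hat = update_weights(S_list, B)
```

Because the solver constrains each weight to be nonnegative, the negative weights that plain CANDECOMP can produce cannot occur in this step.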
9.
Michael J. Brusco 《Journal of Classification》2006,23(2):255-268
In this research note, I present a modified version of G. De Soete, L. Hubert, and P. Arabie’s (1988) simulated annealing
approach for the problem of L2 unidimensional scaling via maximization of the Defays criterion. The modifications include efficient storage and computation
methods that facilitate rapid evaluation of trial solutions. The results of two experimental studies indicate that the enhanced
simulated annealing algorithm is competitive with A. Murillo, J.F. Vera, and W.J. Heiser’s (2005) recently published pertsaus2
procedure in terms of solution quality and computation time. Both Fortran and MATLAB versions of this modified simulated annealing
implementation are available from the author.
10.
The aim of this paper is to analyze two scaling extensions of the Orthogonal Procrustes Problem (OPP) called the pre-scaling
and the post-scaling approaches. We also discuss some problems related to these extensions and propose two new algorithms
to find optimal solutions. These algorithms, which are based on the majorization principle, are shown to be monotonically
convergent and their performance is examined.
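For orientation, the basic OPP with a single post-scaling factor has a well-known closed-form solution via the SVD; this baseline sketch is not the paper's majorization algorithm for the more general pre- and post-scaling problems:

```python
import numpy as np

def procrustes_with_scale(A, B):
    # rotation T (T'T = I) and scalar s minimizing ||A - s B T||_F
    U, sigma, Vt = np.linalg.svd(B.T @ A)
    T = U @ Vt
    s = sigma.sum() / np.trace(B.T @ B)
    return T, s

rng = np.random.default_rng(0)
B = rng.normal(size=(6, 3))
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
A = 2.5 * B @ Q                      # target: a rotated, scaled copy of B
T, s = procrustes_with_scale(A, B)
```

The pre- and post-scaling extensions discussed in the paper replace the single scalar with scaling matrices, which is what makes an iterative majorization scheme necessary.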
11.
Consider N entities to be classified (e.g., geographical areas), a matrix of dissimilarities
between pairs of entities, and a graph H with vertices associated with these entities such that the edges join the vertices corresponding to contiguous entities. The split of a
cluster is the smallest dissimilarity between an entity of this cluster and an entity outside of
it. The single-linkage algorithm (ignoring contiguity between entities) provides partitions into M clusters for which the smallest split of the clusters, called split of the partition, is
maximum. We study here the partitioning of the set of entities into M connected clusters
for all M between N - 1 and 2 (i.e., clusters such that the subgraphs of H induced by their
corresponding sets of entities are connected) with maximum split subject to that condition.
We first provide an exact algorithm with O(N²) complexity for the particular case in which H is a tree. This algorithm in turn suggests a first heuristic algorithm for the general problem. Several variants of this heuristic are also explored. We then present an exact
algorithm for the general case based on iterative determination of cocycles of subtrees and on the solution of auxiliary set covering problems. As solution of the latter problems is
time-consuming for large instances, we provide another heuristic in which the auxiliary
set covering problems are solved approximately. Computational results obtained with the
exact and heuristic algorithms are presented on test problems from the literature.
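The single-linkage fact quoted above has a convenient minimum-spanning-tree formulation: deleting the M − 1 heaviest MST edges of the dissimilarity graph yields M clusters with maximum split (ignoring contiguity). A sketch with illustrative data:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components

def max_split_partition(D, M):
    # single-linkage partition: cut the M - 1 heaviest MST edges
    mst = minimum_spanning_tree(csr_matrix(D)).toarray()
    heaviest = np.argsort(mst, axis=None)[::-1][:M - 1]
    mst.flat[heaviest] = 0.0
    _, labels = connected_components(csr_matrix(mst), directed=False)
    return labels

# two well-separated groups on a line: {0, 1, 2} and {10, 11}
pts = np.array([0.0, 1.0, 2.0, 10.0, 11.0])
D = np.abs(pts[:, None] - pts[None, :])
labels = max_split_partition(D, 2)
```

The contiguity-constrained problem studied in the paper is harder precisely because the clusters must additionally induce connected subgraphs of H.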
12.
Henk A. L. Kiers 《Journal of Classification》1997,14(2):297-310
The SINDCLUS algorithm for fitting the ADCLUS and INDCLUS models deals with a parameter matrix that occurs twice in the model by considering the two occurrences as independent parameter matrices. This procedure has been justified empirically by the observation that upon convergence of the algorithm to the global optimum, the two independently treated parameter matrices turn out to be equal. In the present paper, results are presented that contradict this finding, and a modification of SINDCLUS is presented which obviates the need for independently treating two occurrences of the same parameter matrix.
13.
Jun-Young Oh 《Foundations of Science》2014,19(2):153-174
This study aims to understand scientific inference for the evolutionary procedure of Continental Drift based on abductive inference, which is important for creative inference and scientific discovery during problem solving. We present the following two research problems: (1) we suggest a scientific inference procedure as well as various strategies and a criterion for choosing hypotheses over other competing or previous hypotheses; aspects of this procedure include puzzling observation, abduction, retroduction, updating, deduction, induction, and recycling; and (2) we analyze the discovery of the “theory of continental drift”, known as the Earth science revolution, using our multistage inference procedure. Wegener’s Continental Drift hypothesis had an impact comparable to the revolution caused by Darwin’s theory of evolution in biology. Finally, the suggested inquiry inference model can provide us with a more consistent view of science and promote a deeper understanding of scientific concepts.
14.
We construct a weighted Euclidean distance that approximates any distance or dissimilarity measure between individuals that is based on a rectangular cases-by-variables data matrix. In contrast to regular multidimensional scaling methods for dissimilarity data, our approach leads to biplots of individuals and variables while preserving all the good properties of dimension-reduction methods that are based on the singular-value decomposition. The main benefits are the decomposition of variance into components along principal axes, which provide the numerical diagnostics known as contributions, and the estimation of nonnegative weights for each variable. The idea is inspired by the distance functions used in correspondence analysis and in principal component analysis of standardized data, where the normalizations inherent in the distances can be considered as differential weighting of the variables. In weighted Euclidean biplots, we allow these weights to be unknown parameters, which are estimated from the data to maximize the fit to the chosen distances or dissimilarities. These weights are estimated using a majorization algorithm. Once this extra weight-estimation step is accomplished, the procedure follows the classical path in decomposing the matrix and displaying its rows and columns in biplots.
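The display step can be sketched directly: once the variable weights are fixed (here assumed rather than estimated by the paper's majorization step), weighting the centered columns and taking an SVD reproduces the weighted Euclidean distances in the biplot:

```python
import numpy as np

X = np.array([[1.0, 2.0, 0.5],
              [3.0, 0.0, 1.5],
              [2.0, 1.0, 1.0],
              [0.0, 3.0, 2.0]])
w = np.array([1.0, 0.5, 2.0])        # assumed nonnegative variable weights

Xc = X - X.mean(axis=0)              # center the variables
U, s, Vt = np.linalg.svd(Xc * np.sqrt(w), full_matrices=False)
rows = U * s                         # coordinates of the individuals
cols = Vt.T                         # directions of the variables

# full-rank biplot distances equal the weighted Euclidean distances
d_biplot = np.linalg.norm(rows[0] - rows[1])
d_weighted = np.sqrt((w * (Xc[0] - Xc[1]) ** 2).sum())
```

Truncating `rows` and `cols` to the first two columns gives the planar biplot, with the usual SVD optimality of the low-rank approximation.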
15.
Incremental Classification with Generalized Eigenvalues
Claudio Cifarelli Mario R. Guarracino Onur Seref Salvatore Cuciniello Panos M. Pardalos 《Journal of Classification》2007,24(2):205-219
Supervised learning techniques are widely accepted methods for analyzing data in scientific and real-world problems. Most of these problems require fast and continuous acquisition of data, which are to be used in training the learning system. Therefore, keeping such systems updated may become cumbersome. Various techniques have been devised in the field of machine learning to solve this problem. In this study, we propose an algorithm to reduce the training data to a substantially smaller subset of the original training data to train a generalized eigenvalue classifier. The proposed method provides a constructive way to understand the influence of new training data on an existing classification function. We show through numerical experiments that this technique prevents the overfitting problem of earlier generalized eigenvalue classifiers, while achieving classification performance comparable to state-of-the-art classification methods.
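A minimal proximal-plane sketch in the generalized-eigenvalue spirit of the classifier described above (a plain batch version with an assumed Tikhonov regularization term, not the authors' incremental subset-selection algorithm):

```python
import numpy as np
from scipy.linalg import eigh

def fit_plane(A, B, reg=1e-6):
    # plane close to the rows of A and far from the rows of B:
    # minimize ||[A -1] z||^2 / ||[B -1] z||^2 over z = [w; gamma]
    Ae = np.hstack([A, -np.ones((len(A), 1))])
    Be = np.hstack([B, -np.ones((len(B), 1))])
    G = Ae.T @ Ae + reg * np.eye(Ae.shape[1])
    H = Be.T @ Be + reg * np.eye(Be.shape[1])
    vals, vecs = eigh(G, H)          # generalized eigenproblem G z = lambda H z
    return vecs[:, 0]                # eigenvector of the smallest eigenvalue

def predict(x, z0, z1):
    # assign x to the class whose plane is nearer
    xe = np.append(x, -1.0)
    d0 = abs(xe @ z0) / np.linalg.norm(z0[:-1])
    d1 = abs(xe @ z1) / np.linalg.norm(z1[:-1])
    return 0 if d0 < d1 else 1

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 2)) * 0.1 + [0.0, 0.0]   # class 0 near the origin
B = rng.normal(size=(30, 2)) * 0.1 + [3.0, 3.0]   # class 1 near (3, 3)
z0, z1 = fit_plane(A, B), fit_plane(B, A)
pred = [predict(x, z0, z1) for x in np.vstack([A, B])]
```

The incremental idea in the paper amounts to keeping only the training points that materially change these two eigenvectors as new data arrive.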
16.
We present an approach, independent of the common gradient-based necessary conditions for obtaining a (locally) optimal solution,
to multidimensional scaling using the city-block distance function, and implementable in either a metric or nonmetric context.
The difficulties encountered in relying on a gradient-based strategy are first reviewed: the general weakness in indicating
a good solution that is implied by the satisfaction of the necessary condition of a zero gradient, and the possibility of
actual nonconvergence of the associated optimization strategy. To avoid the dependence on gradients for guiding the optimization
technique, an alternative iterative procedure is proposed that incorporates (a) combinatorial optimization to construct good
object orders along the chosen number of dimensions and (b) nonnegative least-squares to re-estimate the coordinates for the
objects based on the object orders. The re-estimated coordinates are used to improve upon the given object orders, which may
in turn lead to better coordinates, and so on until convergence of the entire process occurs to a (locally) optimal solution.
The approach is illustrated through several data sets on the perception of similarity of rectangles and compared to the results
obtained with a gradient-based method.
17.
Michael J. Brusco 《Journal of Classification》2002,19(1):45-67
L1-norm are also presented. I conclude that the computational difficulty of scaling problems depends largely on the criterion of interest, with unidimensional scaling in the L1-norm being especially challenging.
18.
The majorization method for multidimensional scaling with Kruskal's STRESS has been limited to Euclidean distances only. Here we extend the majorization algorithm to deal with Minkowski distances with 1 ≤ p ≤ 2 and suggest an algorithm that is partially based on majorization for p outside this range. We give some convergence proofs and extend the zero distance theorem of De Leeuw (1984) to Minkowski distances with p > 1.
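For reference, the Euclidean special case of the majorization update (the Guttman transform) is easy to state; a minimal sketch with unit weights, not the paper's Minkowski extension:

```python
import numpy as np

def euclidean_dists(X):
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def stress(X, delta):
    iu = np.triu_indices(len(X), 1)
    return float(np.sum((delta - euclidean_dists(X))[iu] ** 2))

def smacof(delta, X, n_iter=100):
    # Guttman transform: each iteration is guaranteed not to increase STRESS
    n = len(X)
    for _ in range(n_iter):
        D = euclidean_dists(X)
        ratio = np.divide(delta, D, out=np.zeros_like(delta), where=D > 0)
        B = -ratio
        B[np.diag_indices(n)] = ratio.sum(axis=1)
        X = (B @ X) / n
    return X

truth = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
delta = euclidean_dists(truth)                 # perfect Euclidean data
rng = np.random.default_rng(0)
X0 = rng.normal(size=(4, 2))
X1 = smacof(delta, X0.copy(), n_iter=100)
s0, s1 = stress(X0, delta), stress(X1, delta)  # s1 <= s0 by majorization
```

The monotone decrease of STRESS is exactly the property the paper proves carries over to Minkowski distances with 1 ≤ p ≤ 2.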
19.
Mihaela D. Iftime 《Foundations of Science》2011,16(4):353-361
Ever wonder if it is possible to construct a numeric scale for environmental variables, as one does for temperature? This paper is an attempt to construct one. There are two main parts: section “Statistical Analysis of Variations” presents a general statistical strategy for environmental factor selection, and section “Nonlinear Analytical Geometric Model of Variations” develops an analytical geometric representation of system variations in response to environmental changes. The model is used to quantify the effects of environmental interactions. The paper treats only the one-dimensional case; however, the derivation for the case of multiple independent factors follows immediately. The general method developed in this paper may prove applicable to many different fields, such as extensions beyond classical physics, economics, and other sciences. Section “Conclusion” provides an illustration of applications, examples, and implications of the results.
20.
Stephen L. Bieber 《Journal of Classification》1986,3(1):113-134
A procedure is presented which permits the analysis of factor analytic problems in which several groups exist. The analysis incorporates a hierarchical scheme of searching for factorial invariance and is an extension of Meredith's (1964) Method One procedure. By overlaying a contextual frame of reference on a traditional factor analysis solution, it is possible to use this technique to examine structural similarity and dissimilarity between groups. The procedure is exhibited in an example and in addition a comparison is made to discriminant analysis.