Found 20 similar documents (search took 31 ms).
1.
Classification Using Class Cover Catch Digraphs (cited 2 times: 0 self-citations, 2 by others)
Carey E. Priebe, David J. Marchette, Jason G. DeVinney, Diego A. Socolinsky. Journal of Classification, 2003, 20(1): 3-23
We present a semiparametric classification method using class cover catch digraphs based on proximity between training observations. Performance comparisons are presented on synthetic and real examples versus k-nearest neighbors, Fisher's linear discriminant, and support vector machines. We demonstrate that the proposed semiparametric classifier has performance approaching that of the optimal parametric classifier in cases for which the optimal is available for comparison.
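The baseline comparison the abstract describes can be sketched with off-the-shelf tools. This is a hedged illustration using scikit-learn on synthetic data, not the paper's own experiments; the class cover catch digraph classifier itself is not available in scikit-learn, so only the three named baselines are shown.

```python
# Sketch: the three comparison baselines named in the abstract
# (k-NN, Fisher's linear discriminant, SVM) on synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=5, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

for clf in (KNeighborsClassifier(5), LinearDiscriminantAnalysis(), SVC()):
    acc = clf.fit(Xtr, ytr).score(Xte, yte)
    print(type(clf).__name__, round(acc, 3))
```

The synthetic data generator and train/test split here are arbitrary stand-ins for the paper's simulation settings.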
2.
In this paper we show how biplot methodology can be combined with various forms of discriminant analysis, leading to highly informative visual displays of the respective class separations. It is demonstrated that the concept of distance as applied to discriminant analysis provides a unified approach to a wide variety of discriminant analysis procedures, which can be accommodated by simply changing to an appropriate distance metric. These changes in the distance metric are crucial for the construction of appropriate biplots. Several new types of biplots, viz. quadratic discriminant analysis biplots for use with heteroscedastic stratified data, discriminant subspace biplots, and flexible discriminant analysis biplots, are derived and their use illustrated. Advantages of the proposed procedures are pointed out. Although biplot methodology is particularly well suited to complementing discrimination problems with J > 2 classes, its use in 2-class problems is also illustrated.
3.
Manuel Escabias, Ana M. Aguilera, M. Carmen Aguilera-Morillo. Journal of Classification, 2014, 31(3): 296-324
In many statistical applications, data are curves measured as functions of a continuous parameter such as time. Despite their functional nature, and because of discrete-time observation, this type of data is usually analyzed with multivariate statistical methods that do not take into account the high correlation between observations of a single curve at nearby time points. Functional data analysis methodologies have been developed to solve these problems. In order to predict the class membership (multi-category response variable) associated with an observed curve (functional data), a functional generalized logit model is proposed. Baseline-category logit formulations are considered, with estimation based on basis expansions of the sample curves of the functional predictor and parameters. Functional principal component analysis is used to obtain an accurate estimation of the functional parameters and to classify sample curves into the categories of the response variable. The good performance of the proposed methodology is studied in an experimental study with simulated and real data.
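A hedged analogue of the pipeline just described (discretized curves, principal components, then a logit classifier) can be sketched with scikit-learn as a stand-in for the paper's basis-expansion and functional PCA machinery; the simulated curves and all parameter choices below are illustrative assumptions.

```python
# Toy analogue: curves observed on a common grid -> PCA scores -> logit model.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)                       # common observation grid
# two classes of noisy curves with different mean functions (simulated)
X = np.vstack([np.sin(2 * np.pi * t) + rng.normal(0, .3, (60, 50)),
               np.cos(2 * np.pi * t) + rng.normal(0, .3, (60, 50))])
y = np.repeat([0, 1], 60)

clf = make_pipeline(PCA(n_components=4), LogisticRegression())
clf.fit(X, y)
print(round(clf.score(X, y), 3))
```

The multivariate PCA here ignores the smoothness structure that the paper's functional PCA exploits; it only illustrates the shape of the workflow.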
4.
A computationally efficient approximation to the nearest neighbor interchange metric (cited 3 times: 3 self-citations, 0 by others)
The nearest neighbor interchange (nni) metric is a distance measure providing a quantitative measure of dissimilarity between two unrooted binary trees with labeled leaves. The metric has a transparent definition in terms of a simple transformation of binary trees, but its use in nontrivial problems is usually prevented by the absence of a computationally efficient algorithm. Since recent attempts to discover such an algorithm continue to be unsuccessful, we address the complementary problem of designing an approximation to the nni metric. Such an approximation should be well-defined, efficient to compute, comprehensible to users, relevant to applications, and a close fit to the nni metric; the challenge, of course, is to compromise these objectives in such a way that the final design is acceptable to users with practical and theoretical orientations. We describe an approximation algorithm that appears to satisfy these objectives adequately. The algorithm requires O(n) space to compute dissimilarity between binary trees with n labeled leaves; it requires O(n log n) time for rooted trees and O(n² log n) time for unrooted trees. To help the user interpret the dissimilarity measures based on this algorithm, we describe empirical distributions of dissimilarities between pairs of randomly selected trees for both rooted and unrooted cases.

The Natural Sciences and Engineering Research Council of Canada partially supported this work with Grant A-4142.
5.
Recognizing the successes of treed Gaussian process (TGP) models as an interpretable and thrifty model for nonparametric regression, we seek to extend the model to classification. Both treed models and Gaussian processes (GPs) have, separately, enjoyed great success in application to classification problems. An example of the former is Bayesian CART. In the latter, real-valued GP output may be utilized for classification via latent variables, which provide classification rules by means of a softmax function. We formulate a Bayesian model averaging scheme to combine these two models and describe a Monte Carlo method for sampling from the full posterior distribution with joint proposals for the tree topology and the GP parameters corresponding to latent variables at the leaves. We concentrate on efficient sampling of the latent variables, which is important to obtain good mixing in the expanded parameter space. The tree structure is particularly helpful for this task and also for developing an efficient scheme for handling categorical predictors, which commonly arise in classification problems. Our proposed classification TGP (CTGP) methodology is illustrated on a collection of synthetic and real data sets. We assess performance relative to existing methods and thereby show how CTGP is highly flexible, offers tractable inference, produces rules that are easy to interpret, and performs well out of sample.
6.
Charles Bouveyron. Journal of Classification, 2014, 31(1): 49-84
In supervised learning, an important issue usually not taken into account by classical methods is that a class represented in the test set may not have been encountered earlier, in the learning phase. Classical supervised algorithms will automatically label such observations as belonging to one of the known classes in the training set and will not be able to detect new classes. This work introduces a model-based discriminant analysis method, called adaptive mixture discriminant analysis (AMDA), which can detect several unobserved groups of points and can adapt the learned classifier to the new situation. Two EM-based procedures are proposed for parameter estimation, and model selection criteria are used for selecting the actual number of classes. Experiments on artificial and real data demonstrate the ability of the proposed method to deal with complex and real-world problems. The proposed approach is also applied to the detection of unobserved communities in social network analysis.
7.
In multivariate discrimination of several normal populations, the optimal classification procedure is based on quadratic discriminant functions. We compare expected error rates of the quadratic classification procedure when the covariance matrices are estimated under the following four models: (i) arbitrary covariance matrices, (ii) common principal components, (iii) proportional covariance matrices, and (iv) identical covariance matrices. Using Monte Carlo simulation to estimate expected error rates, we study the performance of the four discrimination procedures for five different parameter setups corresponding to standard situations that have been used in the literature. The procedures are examined for sample sizes ranging from 10 to 60, and for two to four groups. Our results quantify the extent to which a parsimonious method reduces error rates, and demonstrate that choosing a simple method of discrimination is often beneficial even if the underlying model assumptions are wrong.

The authors wish to thank the editor and three referees for their helpful comments on the first draft of this article. M. J. Schmid was supported by grants no. 2.724-0.85 and 2.038-0.86 of the Swiss National Science Foundation.
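The flavor of this Monte Carlo comparison can be sketched for the simplest contrast, model (i) (arbitrary covariances, i.e. QDA) versus model (iv) (identical covariances, i.e. LDA), when the identical-covariance assumption actually holds. The scikit-learn estimators, sample sizes, and dimensions below are assumed stand-ins, not the paper's setup.

```python
# Sketch: estimated expected error rates of LDA vs QDA over Monte Carlo
# replications, with data generated under identical covariance matrices.
import numpy as np
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)

rng = np.random.default_rng(0)

def sample(n):
    # two 4-dimensional normal groups with identical (identity) covariance
    X = np.vstack([rng.normal(0, 1, (n, 4)), rng.normal(1, 1, (n, 4))])
    return X, np.repeat([0, 1], n)

errs = {"LDA": [], "QDA": []}
for _ in range(20):                      # Monte Carlo replications
    Xtr, ytr = sample(20)                # small training sample per group
    Xte, yte = sample(500)               # large test sample for error rate
    for name, est in (("LDA", LinearDiscriminantAnalysis()),
                      ("QDA", QuadraticDiscriminantAnalysis())):
        errs[name].append(1 - est.fit(Xtr, ytr).score(Xte, yte))

print({k: round(float(np.mean(v)), 3) for k, v in errs.items()})
```

Under this setup the parsimonious rule (LDA) typically achieves the lower average error, illustrating the abstract's point that simpler covariance models can pay off at small sample sizes.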
8.
We propose a new nonparametric family of oscillation heuristics for improving linear classifiers in the two-group discriminant problem. The heuristics are motivated by the intuition that the classification accuracy of a separating hyperplane can be improved through small perturbations to its slope and position, accomplished by substituting training observations near the hyperplane for those used to generate it. In an extensive simulation study, using data generated from multivariate normal distributions under a variety of conditions, the oscillation heuristics consistently improve upon the classical linear and logistic discriminant functions, as well as two published linear programming-based heuristics and a linear support vector machine. Added to any of the methods above, they approach, and frequently attain, the best possible accuracy on the training samples, as determined by a mixed-integer programming (MIP) model, at a much smaller computational cost. They also improve expected accuracy on the overall populations when the populations overlap significantly and the heuristics are trained with large samples, at least in situations where the data conditions do not explicitly favor a particular classifier.
9.
In compositional data analysis, an observation is a vector containing nonnegative values, only the relative sizes of which are considered to be of interest. Without loss of generality, a compositional vector can be taken to be a vector of proportions that sum to one. Data of this type arise in many areas including geology, archaeology, biology, economics and political science. In this paper we investigate methods for classification of compositional data. Our approach centers on the idea of using the α-transformation to transform the data and then to classify the transformed data via regularized discriminant analysis and the k-nearest neighbors algorithm. Using the α-transformation generalizes two rival approaches in compositional data analysis, one (when α = 1) that treats the data as though they were Euclidean, ignoring the compositional constraint, and another (when α = 0) that employs Aitchison's centered log-ratio transformation. A numerical study with several real datasets shows that whether using α = 1 or α = 0 gives better classification performance depends on the dataset, and moreover that using an intermediate value of α can sometimes give better performance than using either 1 or 0.
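A minimal sketch of the α-transformation idea, assuming the power-then-center form used in the compositional data literature; the exact scaling and the final isometric projection in the paper may differ, so treat this helper as an approximation. As α approaches 0 it recovers Aitchison's centered log-ratio transform, and α = 1 treats the proportions as essentially Euclidean.

```python
import numpy as np

def alpha_transform(x, alpha):
    """Sketch of the alpha-transformation for one compositional vector x.

    alpha = 0: centered log-ratio transform.
    alpha != 0: power transformation followed by centering and scaling.
    Hypothetical helper, not the paper's exact implementation.
    """
    x = np.asarray(x, dtype=float)
    x = x / x.sum()                      # reduce to proportions
    if alpha == 0:
        lx = np.log(x)
        return lx - lx.mean()            # centered log-ratio
    u = x**alpha / (x**alpha).sum()      # power transformation
    return (len(x) * u - 1.0) / alpha

z = alpha_transform([0.2, 0.3, 0.5], 0.5)
```

After transforming, the classification step in the abstract would apply regularized discriminant analysis or k-nearest neighbors to the transformed vectors, scanning α for the best cross-validated performance.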
10.
Consider N entities to be classified, with given weights, and a matrix of dissimilarities between pairs of them. The split of a cluster is the smallest dissimilarity between an entity in that cluster and an entity outside it. The single-linkage algorithm provides partitions into M clusters for which the smallest split is maximum. We consider the problems of finding maximum split partitions with exactly M clusters and with at most M clusters subject to the additional constraint that the sum of the weights of the entities in each cluster never exceeds a given bound. These two problems are shown to be NP-hard and reducible to a sequence of bin-packing problems. An O(N²) algorithm for the particular case M = N of the second problem is also presented. Computational experience is reported.

Acknowledgments: Work of the first author was supported in part by AFOSR grants 0271 and 0066 to Rutgers University and was done in part during a visit to GERAD, École Polytechnique de Montréal, whose support is gratefully acknowledged. Work of the second and third authors was supported by NSERC grant GP0036426 and by FCAR grant 89EQ4144. We are grateful to Silvano Martello and Paolo Toth for making available to us their program MTP for the bin-packing problem and to three anonymous referees for comments which helped to improve the presentation of the paper.
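The split criterion has a direct computational reading. The following is a hedged sketch of the definition only (the function name and example data are illustrative, not from the paper); it assumes the partition has at least two clusters.

```python
import numpy as np

def smallest_split(D, labels):
    # Split of a cluster: the smallest dissimilarity between an entity
    # inside the cluster and an entity outside it. Returns the minimum
    # split over all clusters of the partition.
    D = np.asarray(D, dtype=float)
    labels = np.asarray(labels)
    splits = []
    for c in np.unique(labels):
        inside = labels == c
        splits.append(D[np.ix_(inside, ~inside)].min())
    return min(splits)

D = np.array([[0, 1, 5, 6],
              [1, 0, 4, 7],
              [5, 4, 0, 2],
              [6, 7, 2, 0]])
print(smallest_split(D, [0, 0, 1, 1]))  # 4.0
```

The maximum split problems in the abstract then ask for the partition (with exactly or at most M clusters, possibly weight-constrained) maximizing this quantity.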
11.
This paper presents a conditional mixture, maximum likelihood methodology for performing clusterwise linear regression. This new methodology simultaneously estimates separate regression functions and membership in K clusters or groups. A review of related procedures is discussed with an associated critique. The conditional mixture, maximum likelihood methodology is introduced together with the EM algorithm utilized for parameter estimation. A Monte Carlo analysis is performed via a fractional factorial design to examine the performance of the procedure. Next, a marketing application is presented concerning the evaluations of trade show performance by senior marketing executives. Finally, other potential applications and directions for future research are identified.
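The conditional-mixture EM idea can be sketched for K = 2 clusters and one predictor. This is a toy version under assumed Gaussian errors, with an initialization chosen by hand; it is not the paper's implementation and omits its model-selection and design details.

```python
# Toy clusterwise linear regression: mixture of two regression lines via EM.
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1, 1, n)
z = rng.integers(0, 2, n)                           # true cluster labels
y = np.where(z == 0, 2 * x + 1, -2 * x - 1) + rng.normal(0, .2, n)

X = np.column_stack([np.ones(n), x])                # intercept + slope design
beta = np.array([[0.5, 1.0], [0.5, -1.0]])          # initial lines
pi, sigma2 = np.array([.5, .5]), np.array([1.0, 1.0])

for _ in range(50):                                 # EM iterations
    # E-step: posterior membership probabilities under Gaussian errors
    dens = np.stack([pi[k] / np.sqrt(2 * np.pi * sigma2[k]) *
                     np.exp(-(y - X @ beta[k])**2 / (2 * sigma2[k]))
                     for k in range(2)])
    r = dens / dens.sum(axis=0)
    # M-step: weighted least squares per cluster, then variance and weight
    for k in range(2):
        W = r[k]
        beta[k] = np.linalg.solve((X * W[:, None]).T @ X, X.T @ (W * y))
        sigma2[k] = (W * (y - X @ beta[k])**2).sum() / W.sum()
        pi[k] = W.mean()

print(np.round(beta, 1))
```

With the lines well separated, the fitted intercept/slope pairs should recover roughly (1, 2) and (-1, -2).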
12.
Yves Bouchard. Foundations of Science, 2007, 12(4): 325-336
In this paper, I show the complementarity of foundationalism and coherentism with respect to any efficient system of beliefs by means of a distinction between two types of proposition, drawn from an analogy with an axiomatic system. This distinction is based on the way a given proposition is acknowledged as true, either by declaration (F-proposition) or by preservation (C-proposition). Within such a perspective, i.e., epistemological complementarism, not only can one see how the usual opposition between foundationalism and coherentism is irrelevant, but furthermore one can appreciate the reciprocal relation between these two theories as they refer to two separate epistemological functions involved in the dynamics of constituting and expanding an epistemic system.
13.
Ryszard Wójcicki. Foundations of Science, 1995, 1(4): 471-516
This paper was written with two aims in mind. A large part of it is just an exposition of Tarski's theory of truth. Philosophers do not agree on how Tarski's theory is related to their investigations. Some of them doubt whether that theory has any relevance to philosophical issues, and in particular whether it can be applied in dealing with the problems of the philosophy (theory) of science.

In this paper I argue that Tarski's chief concern was the following question. Suppose a language L belongs to the class of languages for which, in full accordance with some formal conditions set in advance, we are able to define the class of all the semantic interpretations the language may acquire. Every interpretation of L can be viewed as a certain structure to which the expressions of the language may refer. Suppose that a specific interpretation of the language L was singled out as the intended one. Suppose, moreover, that the intended interpretation can be characterized in a metalanguage L⁺. If the above assumptions are satisfied, can the notion of truth for L be defined in the metalanguage L⁺ and, if it can, how can this be done?
14.
Reduced K-means (RKM) and Factorial K-means (FKM) are two data reduction techniques incorporating principal component analysis and K-means into a unified methodology to obtain a reduced set of components for variables and an optimal partition for objects. RKM finds clusters in a reduced space by maximizing the between-clusters deviance without imposing any condition on the within-clusters deviance, so that clusters are isolated but might be heterogeneous. FKM, on the other hand, identifies clusters in a reduced space by minimizing the within-clusters deviance without imposing any condition on the between-clusters deviance; thus clusters are homogeneous but might not be isolated. The two techniques give different results because the total deviance in the reduced space is not constant across the two methodologies; hence the minimization of the within-clusters deviance is not equivalent to the maximization of the between-clusters deviance. In this paper a modification of the two techniques is introduced to avoid the aforementioned weaknesses. It is shown that the two modified methods give the same results, thus merging RKM and FKM into a new methodology. It is called Factor Discriminant K-means (FDKM), because it combines Linear Discriminant Analysis and K-means. The paper examines several theoretical properties of FDKM and its performance in a simulation study. An application to real-world data is presented to show the features of FDKM.
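For contrast with RKM, FKM, and FDKM, the naive "tandem" two-step approach they improve upon (reduce the variables first, then cluster the objects in the reduced space) can be sketched as follows. scikit-learn is used as an assumed stand-in; the joint optimization that distinguishes the actual methods is not shown.

```python
# Tandem analysis sketch: PCA to a reduced space, then K-means there.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 6)),   # two simulated groups
               rng.normal(3, 1, (50, 6))])

Z = PCA(n_components=2).fit_transform(X)    # reduced set of components
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(Z)
print(np.bincount(labels))
```

Because the PCA step here knows nothing about the clusters, components that discriminate poorly can dominate the reduced space; RKM/FKM/FDKM address exactly that by optimizing projection and partition together.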
15.
The process of abstraction and concretisation is a label used for an explicative theory of scientific model-construction. In scientific theorising this process enters
at various levels. We could identify two principal levels of abstraction that are useful to our understanding of theory-application.
The first level is that of selecting a small number of variables and parameters abstracted from the universe of discourse
and used to characterise the general laws of a theory. In classical mechanics, for example, we select position and momentum and establish a relation amongst the two variables, which we call Newton’s 2nd law. The specification of the unspecified
elements of scientific laws, e.g. the force function in Newton’s 2nd law, is what would establish the link between the assertions
of the theory and physical systems. In order to unravel how and with what conceptual resources scientific models are constructed,
how they function and how they relate to theory, we need a view of theory-application that can accommodate our constructions
of representation models. For this we need to expand our understanding of the process of abstraction to also explicate the
process of specifying force functions etc. This is the second principal level at which abstraction enters in our theorising
and in which I focus. In this paper, I attempt to elaborate a general analysis of the process of abstraction and concretisation
involved in scientific- model construction, and argue why it provides an explication of the construction of models of the
nuclear structure. 相似文献
16.
MCLUST is a software package for model-based clustering, density estimation and discriminant analysis interfaced to the S-PLUS commercial software and the R language. It implements parameterized Gaussian hierarchical clustering algorithms and the EM algorithm for parameterized Gaussian mixture models with the possible addition of a Poisson noise term. Also included are functions that combine hierarchical clustering, EM and the Bayesian Information Criterion (BIC) in comprehensive strategies for clustering, density estimation, and discriminant analysis. MCLUST provides functionality for displaying and visualizing clustering and classification results. A web page with related links can be found at .
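MCLUST itself is S-PLUS/R software, but the model-based clustering workflow it packages (fit Gaussian mixtures by EM for several numbers of components, select by BIC) can be illustrated with an analogous Python sketch using scikit-learn; none of this is MCLUST's own API.

```python
# Analogous workflow: fit Gaussian mixtures for k = 1..4 and select by BIC.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)),   # two well-separated groups
               rng.normal(4, 1, (100, 2))])

models = [GaussianMixture(k, random_state=0).fit(X) for k in range(1, 5)]
best = min(models, key=lambda m: m.bic(X))   # lower BIC is better here
print("chosen components:", best.n_components)
```

MCLUST additionally varies the covariance parameterization of each mixture and initializes EM from model-based hierarchical clustering, which this sketch omits.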
20.
NP-hard Approximation Problems in Overlapping Clustering (cited 1 time: 1 self-citation, 0 by others)
Lp-norm (p < ∞). These problems also correspond to the approximation by a strongly Robinson dissimilarity or by a dissimilarity fulfilling the four-point inequality (Bandelt 1992; Diatta and Fichet 1994). The results are extended to circular strongly Robinson dissimilarities, indexed k-hierarchies (Jardine and Sibson 1971, pp. 65-71), and to proper dissimilarities satisfying the Bertrand and Janowitz (k + 2)-point inequality (Bertrand and Janowitz 1999). Unidimensional scaling (linear or circular) is reinterpreted as a clustering problem and its hardness is established, but only for the L1 norm.