Found 20 similar articles (search time: 15 ms)
1.
We devise a classification algorithm based on generalized linear mixed model (GLMM) technology. The algorithm incorporates
spline smoothing, additive model-type structures and model selection. For reasons of speed we employ the Laplace approximation,
rather than Monte Carlo methods. Tests on real and simulated data show the algorithm to have good classification performance.
Moreover, the resulting classifiers are generally interpretable and parsimonious.
2.
In the framework of incomplete data analysis, this paper provides a nonparametric approach to missing data imputation based on Information Retrieval. In particular, an incremental procedure based on the iterative use of a tree-based method is proposed and a suitable Incremental Imputation Algorithm is introduced. The key idea is to define a lexicographic ordering of cases and variables so that conditional mean imputation via binary trees can be performed incrementally. A simulation study and real data applications are carried out to describe the advantages and the performance with respect to standard approaches.
3.
Cognitive diagnostic models provide valuable information on whether a student has mastered each of the attributes a test intends to evaluate. Despite its generality, the generalized DINA (G-DINA) model allows for the possibility of lower correct rates for students who master more attributes than for those who master fewer. This paper considers the use of an order-constrained parameter space for the G-DINA model to avoid such a counter-intuitive phenomenon and proposes two algorithms, the upward and downward methods, for parameter estimation. Through simulation studies, we compare the accuracy in parameter estimation and in classification of attribute patterns obtained from the two proposed algorithms and the current approach when the restricted parameter space is true. Our results show that the upward method performs the best among the three, and therefore it is recommended for estimation, regardless of the distribution of respondents’ attribute patterns, types of test items, and the sample size of the data.
4.
5.
6.
Recognizing the successes of treed Gaussian process (TGP) models as an interpretable and thrifty model for nonparametric regression,
we seek to extend the model to classification. Both treed models and Gaussian processes (GPs) have, separately, enjoyed great
success in application to classification problems. An example of the former is Bayesian CART. In the latter, real-valued GP
output may be utilized for classification via latent variables, which provide classification rules by means of a softmax function.
We formulate a Bayesian model averaging scheme to combine these two models and describe a Monte Carlo method for sampling
from the full posterior distribution with joint proposals for the tree topology and the GP parameters corresponding to latent variables at the leaves. We concentrate on efficient sampling of the latent variables,
which is important to obtain good mixing in the expanded parameter space. The tree structure is particularly helpful for this
task and also for developing an efficient scheme for handling categorical predictors, which commonly arise in classification
problems. Our proposed classification TGP (CTGP) methodology is illustrated on a collection of synthetic and real data sets.
We assess performance relative to existing methods and thereby show how CTGP is highly flexible, offers tractable inference,
produces rules that are easy to interpret, and performs well out of sample.
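The abstract above mentions that real-valued GP output can be turned into classification rules by means of a softmax function. A minimal sketch of that single step follows; it is not the CTGP sampler, and the latent values, array shapes, and function names are hypothetical, chosen only for illustration.

```python
import numpy as np


def softmax(latent):
    """Map real-valued latent scores (one column per class) to class probabilities."""
    z = np.asarray(latent, dtype=float)
    # Subtracting the row maximum leaves the result unchanged but avoids overflow.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def classify(latent):
    """Assign each input to the class with the largest softmax probability."""
    return np.argmax(softmax(latent), axis=-1)


# Hypothetical latent values for 2 inputs and 3 classes (not taken from the paper).
latent_values = np.array([[0.2, 1.5, -0.3],
                          [2.0, 0.1, 0.4]])
probs = softmax(latent_values)
labels = classify(latent_values)
```

Each row of `probs` sums to one, and `labels` holds the index of the most probable class for each input.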
7.
8.
A common practice in cross validation research in the behavioral sciences is to employ either the product moment correlation
or a simple tabulation of first-choice “hits” for measuring the accuracy with which various preference models predict subjects’
responses to a holdout sample of choice objects.
We propose a nonparametric approach for summarizing the accuracy of predicted rankings across a set of holdout-sample options.
The methods that we develop contain a novel way to deal with ties and an approach to the different weighting of rank positions.
9.
10.
11.
Marek Ancukiewicz 《Journal of Classification》1998,15(1):129-141
I consider a new problem of classification into n (n ≥ 2) disjoint classes based on features of unclassified data. It is assumed that the data are grouped into m (m ≥ n) disjoint sets and within each set the distribution of features is a mixture of distributions corresponding to particular
classes. Moreover, the mixing proportions should be known and form a matrix of rank n. The idea of the solution is, first, to estimate feature densities in all the groups, then to solve the linear system for component
densities. The proposed classification method is asymptotically optimal, provided a consistent method of density estimation
is used. For illustration, the method is applied to determining perfusion status in myocardial infarction patients, using
creatine kinase measurements.
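The second step described in the abstract above, solving a linear system for the component densities, can be sketched as follows. This is an illustrative reading, not the author's implementation: it assumes the group densities have already been estimated on a common grid, uses least squares because m may exceed n, and all names (`unmix_densities`, `mixing`, `group_densities`) and the toy normal components are hypothetical.

```python
import numpy as np


def unmix_densities(group_densities, mixing):
    """Solve mixing @ components = group_densities for the component densities.

    group_densities: (m, G) array, one estimated density per group on a grid of G points.
    mixing:          (m, n) array of known mixing proportions, assumed to have rank n.
    Returns an (n, G) array of class-specific densities (least-squares solution).
    """
    mixing = np.asarray(mixing, dtype=float)
    group_densities = np.asarray(group_densities, dtype=float)
    if np.linalg.matrix_rank(mixing) < mixing.shape[1]:
        raise ValueError("mixing matrix must have rank n")
    components, *_ = np.linalg.lstsq(mixing, group_densities, rcond=None)
    return components


def normal_pdf(x, mu, sd):
    """Density of a normal distribution, used only to build the toy example."""
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


# Toy data: n = 2 classes, m = 3 groups with known mixing proportions.
grid = np.linspace(-4.0, 8.0, 241)
true_components = np.vstack([normal_pdf(grid, 0.0, 1.0),
                             normal_pdf(grid, 4.0, 1.0)])
mixing = np.array([[0.8, 0.2],
                   [0.3, 0.7],
                   [0.5, 0.5]])
# Stand-in for the density estimates of step one (exact here, noisy in practice).
group_densities = mixing @ true_components
recovered = unmix_densities(group_densities, mixing)
```

With real data the group densities would come from a density estimator, so the recovered components would be approximate and could dip slightly below zero.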
12.
13.
14.
Nedret Billor Asheber Abebe Asuman Turkmen Sai V. Nudurupati 《Journal of Classification》2008,25(2):249-260
Suppose y, a d-dimensional (d ≥ 1) vector, is drawn from a mixture of k (k ≥ 2) populations, given by Π1, Π2, …, Πk. We wish to identify the population that is the most likely source of the point y. To solve this classification problem many classification rules have been proposed in the literature. In this study, a new
nonparametric classifier based on the transvariation probabilities of data depth is proposed. We compare the performance of
the newly proposed nonparametric classifier with classical and maximum depth classifiers using some benchmark and simulated
data sets.
The authors thank the editor and referees for comments that led to an improvement of this paper. This work is partially supported
by the National Science Foundation under Grant No. DMS-0604726.
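For orientation, the maximum depth classifier that the abstract above uses as a comparison can be sketched as below. This is not the proposed transvariation-based classifier. The choice of Mahalanobis depth is an assumption made here for simplicity (other depth functions could be substituted), and the function names and simulated populations are hypothetical.

```python
import numpy as np


def mahalanobis_depth(y, sample):
    """Mahalanobis depth of point y with respect to a sample (rows are observations)."""
    sample = np.asarray(sample, dtype=float)
    mean = sample.mean(axis=0)
    cov = np.atleast_2d(np.cov(sample, rowvar=False))
    diff = np.asarray(y, dtype=float) - mean
    squared_distance = diff @ np.linalg.solve(cov, diff)
    return 1.0 / (1.0 + squared_distance)


def max_depth_classify(y, samples):
    """Return the index of the population in which y is deepest."""
    depths = [mahalanobis_depth(y, s) for s in samples]
    return int(np.argmax(depths))


# Two simulated, well-separated populations in d = 2 (illustrative only).
rng = np.random.default_rng(0)
pop1 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2))
pop2 = rng.normal(loc=[5.0, 5.0], scale=1.0, size=(200, 2))
label_near_first = max_depth_classify([0.2, -0.1], [pop1, pop2])
label_near_second = max_depth_classify([4.8, 5.3], [pop1, pop2])
```

A point is assigned to the population whose sample it sits most centrally in, which is the rule the depth-based literature generalizes.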
15.
16.
Donatella Vicari 《Journal of Classification》2014,31(3):386-420
When clustering asymmetric proximity data, often only the average amounts are considered, on the assumption that the asymmetry is due to noise. But when the asymmetry is structural, as typically may happen for exchange flows, migration data or confusion data, this may strongly affect the search for the groups because the directions of the exchanges are ignored and not integrated in the clustering process. The clustering model proposed here relies on the decomposition of the asymmetric dissimilarity matrix into symmetric and skew-symmetric effects, each decomposed into within- and between-cluster effects. The classification structures used here are generally based on two different partitions of the objects fitted to the symmetric and the skew-symmetric part of the data, respectively; the restricted case is also presented where a single partition jointly fits both of them, allowing for clusters of objects that are similar with respect to the average amounts and directions of the data. Parsimonious models are presented which allow for effective and simple graphical representations of the results.
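The first step named in the abstract above, splitting an asymmetric matrix into symmetric and skew-symmetric parts, is a standard identity and can be sketched as follows. The further decomposition into within- and between-cluster effects and the fitting of partitions are not shown; the function name and the toy flow matrix are hypothetical.

```python
import numpy as np


def split_asymmetric(d):
    """Split a square matrix into its symmetric and skew-symmetric parts.

    The symmetric part holds the average amounts (d[i, j] + d[j, i]) / 2;
    the skew-symmetric part holds the imbalances (d[i, j] - d[j, i]) / 2.
    """
    d = np.asarray(d, dtype=float)
    if d.ndim != 2 or d.shape[0] != d.shape[1]:
        raise ValueError("expected a square matrix")
    sym = (d + d.T) / 2.0
    skew = (d - d.T) / 2.0
    return sym, skew


# Toy exchange flows between three objects (illustrative numbers only).
flows = np.array([[0, 10, 4],
                  [2, 0, 7],
                  [6, 1, 0]])
sym, skew = split_asymmetric(flows)
```

The two parts sum back to the original matrix, so nothing is lost by modelling them separately.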
17.
18.
Call for Abstracts
Annual Conference of the Classification Society
19.
20.
Herbert K.H. Lee 《Journal of Classification》2007,24(1):53-70
Feedforward neural networks are a popular tool for classification, offering a method for fully flexible modeling. This paper
looks at the underlying probability model, so as to understand statistically what is going on in order to facilitate an intelligent
choice of prior for a fully Bayesian analysis. The parameters turn out to be difficult or impossible to interpret, and yet
a coherent prior requires a quantification of this inherent uncertainty. Several approaches are discussed, including flat
priors, Jeffreys priors and reference priors.