Similar Documents
20 similar documents found (search time: 15 ms)
1.
In educational measurement, cognitive diagnosis models have been developed to allow assessment of specific skills that are needed to perform tasks. Skill knowledge is characterized as present or absent and represented by a vector of binary indicators, or the skill set profile. After determining which skills are needed for each assessment item, a model is specified for the relationship between item responses and skill set profiles. Cognitive diagnosis models are often used for diagnosis, that is, for classifying students into the different skill set profiles. Generally, cognitive diagnosis models do not exploit student covariate information. However, investigating the effects of student covariates, such as gender, SES, or educational interventions, on skill knowledge mastery is important in education research, and covariate information may improve classification of students to skill set profiles. We extend a common cognitive diagnosis model, the DINA model, by modeling the relationship between the latent skill knowledge indicators and covariates. The probability of skill mastery is modeled as a logistic regression model, possibly with a student-level random intercept, giving a higher-order DINA model with a latent regression. Simulations show that parameter recovery is good for these models and that inclusion of covariates can improve skill diagnosis. When applying our methods to data from an online tutor, we obtain reasonable and interpretable parameter estimates that allow more detailed characterization of groups of students who differ in their predicted skill set profiles.  相似文献   
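To make the model structure concrete, here is a minimal NumPy sketch of the two layers this abstract describes: a logistic latent regression for skill mastery feeding a DINA response model. The Q-matrix, the guessing and slipping values, and the covariate effects are hypothetical illustrations, not the authors' estimates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: J items, K skills, Q-matrix mapping items to required skills.
J, K = 4, 2
Q = np.array([[1, 0],
              [0, 1],
              [1, 1],
              [1, 0]])
guess = np.full(J, 0.2)   # g_j: P(correct | not all required skills mastered)
slip  = np.full(J, 0.1)   # s_j: P(incorrect | all required skills mastered)

def skill_mastery_prob(x, beta0, beta1, theta=0.0):
    """Higher-order part: logistic regression of skill mastery on a covariate x,
    with an optional student-level random intercept theta (illustrative values)."""
    return 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x + theta)))

def dina_correct_prob(alpha):
    """DINA part: P(X_j = 1 | alpha) = (1 - s_j)^eta_j * g_j^(1 - eta_j),
    where eta_j = 1 iff all skills required by item j are mastered."""
    eta = np.all(alpha >= Q, axis=1).astype(float)
    return (1 - slip) ** eta * guess ** (1 - eta)

# Simulate one student: covariate -> latent skill profile -> item responses.
x = 1.0                                      # e.g., received the intervention
p_mastery = skill_mastery_prob(x, beta0=np.array([-0.5, 0.0]),
                                  beta1=np.array([1.0, 0.8]))
alpha = rng.binomial(1, p_mastery)           # latent skill set profile
responses = rng.binomial(1, dina_correct_prob(alpha))
print(p_mastery, alpha, responses)
```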

2.
Taxonomy-Based Modeling was applied to describe drivers’ mental models of variable message signs (VMSs) displayed on expressways. Progress in road telematics has made it possible to introduce VMSs. Sensors embedded in the carriageway every 500 m record certain variables (speed, flow rate, etc.) that are transformed in real time into “driving times” to a given destination if road conditions do not change. VMS systems are auto-regulative Man-Machine (AMMI) systems which incorporate a model of the user: if the traffic flow is too high, then drivers should choose alternative routes, and in so doing the traffic flow should decrease. The model of the user is based on suppositions such as: people do not like to waste time, they fully understand the displayed messages, they trust the displayed values, and they know of alternative routes. However, people also have a model of the way the system functions, and if they do not believe the contents of the message, they will not act as expected. We collected data through interviews with drivers using the critical incidents technique (Flanagan, 1985). Results show that the mental models drivers have of the way the VMS system works are varied but not numerous, and that most of them differ from the “ideal expert” mental model. It is clear that users do not have an adequate model of how the VMS system works and that VMS planners have a model of user behaviour that does not correspond to the behaviour of the drivers we interviewed. Finally, Taxonomy-Based Modeling is discussed as a tool for mental model remediation.

3.
In many statistical applications data are curves measured as functions of a continuous parameter such as time. Despite their functional nature, and because they are observed at discrete time points, such data are usually analyzed with multivariate statistical methods that do not take into account the high correlation between observations of a single curve at nearby time points. Functional data analysis methodologies have been developed to address this type of problem. In order to predict the class membership (multi-category response variable) associated with an observed curve (functional data), a functional generalized logit model is proposed. Baseline-category logit formulations are considered, with estimation based on basis expansions of the sample curves of the functional predictor and parameters. Functional principal component analysis is used to obtain an accurate estimation of the functional parameters and to classify sample curves into the categories of the response variable. The good performance of the proposed methodology is studied through an experimental study with simulated and real data.

4.
Like general-purpose vocabulary, terms have two principal functions, communicative and cognitive. However, since terms are specialized words that designate concepts, the cognitive function is their most essential one, and their various other functions can largely be subsumed under these two functional categories. Within the cognitive category, the enlightening (heuristic) function is the one that best reflects the distinctive character of terminology. This enlightening function can be further divided into a systematizing function, a pattern-forming function, and a predictive function.

5.
We consider two fundamental properties in the analysis of two-way tables of positive data: the principle of distributional equivalence, one of the cornerstones of correspondence analysis of contingency tables, and the principle of subcompositional coherence, which forms the basis of compositional data analysis. For an analysis to be subcompositionally coherent, it suffices to analyze the ratios of the data values. A common approach to dimension reduction in compositional data analysis is to perform principal component analysis on the logarithms of ratios, but this method does not obey the principle of distributional equivalence. We show that by introducing weights for the rows and columns, the method achieves this desirable property and can be applied to a wider class of methods. This weighted log-ratio analysis is theoretically equivalent to “spectral mapping”, a multivariate method developed almost 30 years ago for displaying ratio-scale data from biological activity spectra. The close relationship between spectral mapping and correspondence analysis is also explained, as well as their connection with association modeling. The weighted log-ratio methodology is used here to visualize frequency data in linguistics and chemical compositional data in archeology. The first author acknowledges research support from the Fundación BBVA in Madrid as well as partial support by the Spanish Ministry of Education and Science, grant MEC-SEJ2006-14098. The constructive comments of the referees, who also brought additional relevant literature to our attention, significantly improved our article.
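As a rough illustration of the idea, the sketch below performs a weighted log-ratio analysis of a small positive table: row and column masses are taken from the margins as in correspondence analysis, the log-transformed table is double-centered with those weights, and a weighted SVD gives principal coordinates. This is a schematic reading of the abstract, not the authors' implementation, and the example table is invented.

```python
import numpy as np

def weighted_log_ratio_analysis(N):
    """Schematic weighted log-ratio analysis of a positive two-way table N:
    log-transform, double-center with row/column masses taken from the margins,
    then compute a weighted SVD to obtain principal coordinates."""
    P = N / N.sum()
    r = P.sum(axis=1)            # row masses
    c = P.sum(axis=0)            # column masses
    L = np.log(N)
    # Weighted double-centering: subtract weighted row and column means.
    L = L - (L @ c)[:, None] - (r @ L)[None, :] + r @ L @ c
    # Weighted SVD: scale rows and columns by the square roots of the masses.
    S = np.sqrt(r)[:, None] * L * np.sqrt(c)[None, :]
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    rows = U / np.sqrt(r)[:, None] * sv      # principal row coordinates
    cols = Vt.T / np.sqrt(c)[:, None] * sv   # principal column coordinates
    return rows, cols, sv

# Invented 3x3 positive table, purely for demonstration.
N = np.array([[10.0, 20.0, 5.0],
              [ 4.0, 15.0, 9.0],
              [12.0,  6.0, 8.0]])
rows, cols, sv = weighted_log_ratio_analysis(N)
print(sv)   # the last singular value is ~0 because of the centering
```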

6.
The Enlightening Function of Terminology   (Total citations: 1; self-citations: 0; citations by others: 1)
Like general-purpose vocabulary, terms have two principal functions, communicative and cognitive. However, since terms are specialized words that designate concepts, the cognitive function is their most essential one, and their various other functions can largely be subsumed under these two functional categories. Within the cognitive category, the enlightening (heuristic) function is the one that best reflects the distinctive character of terminology. This enlightening function can be further divided into a systematizing function, a pattern-forming function, and a predictive function.

7.
We propose a new nonparametric family of oscillation heuristics for improving linear classifiers in the two-group discriminant problem. The heuristics are motivated by the intuition that the classification accuracy of a separating hyperplane can be improved through small perturbations to its slope and position, accomplished by substituting training observations near the hyperplane for those used to generate it. In an extensive simulation study, using data generated from multivariate normal distributions under a variety of conditions, the oscillation heuristics consistently improve upon the classical linear and logistic discriminant functions, as well as two published linear programming-based heuristics and a linear Support Vector Machine. Added to any of the methods above, they approach, and frequently attain, the best possible accuracy on the training samples, as determined by a mixed-integer programming (MIP) model, at a much smaller computational cost. They also improve expected accuracy on the overall populations when the populations overlap significantly and the heuristics are trained with large samples, at least in situations where the data conditions do not explicitly favor a particular classifier.  相似文献   

8.
An ultrametric topology formalizes the notion of hierarchical structure. An ultrametric embedding, referred to here as ultrametricity, is implied by a hierarchical embedding. Such hierarchical structure can be global in the data set, or local. By quantifying the extent, or degree, of ultrametricity in a data set, we show that ultrametricity becomes pervasive as dimensionality and/or spatial sparsity increases. This leads us to assert that very high dimensional data are of simple structure. We exemplify this finding through a range of simulated data cases. We also discuss application to very high frequency time series segmentation and modeling.
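One simple way to quantify a degree of ultrametricity, in the spirit of this abstract, is to sample point triples and count how often the triangle they form is approximately isosceles with a small base, i.e. its two largest pairwise distances are nearly equal. The coefficient, the tolerance, and the random-data demonstration below are illustrative choices, not necessarily the measure used in the paper.

```python
import numpy as np

def ultrametricity_degree(X, n_triples=2000, tol=0.05, seed=0):
    """Crude degree-of-ultrametricity score: the fraction of sampled point
    triples whose two largest pairwise distances differ (relatively) by at
    most `tol`, i.e. whose triangle is close to isosceles with a small base."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    hits = 0
    for _ in range(n_triples):
        i, j, k = rng.choice(n, size=3, replace=False)
        d = np.sort([np.linalg.norm(X[i] - X[j]),
                     np.linalg.norm(X[i] - X[k]),
                     np.linalg.norm(X[j] - X[k])])
        if d[2] > 0 and (d[2] - d[1]) / d[2] <= tol:
            hits += 1
    return hits / n_triples

# The score tends to increase with dimensionality for sparse/random data.
rng = np.random.default_rng(1)
for dim in (2, 20, 200):
    X = rng.uniform(size=(100, dim))
    print(dim, round(ultrametricity_degree(X), 3))
```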

9.
We propose functional cluster analysis (FCA) for multidimensional functional data sets, utilizing orthonormalized Gaussian basis functions. An essential point in FCA is the use of orthonormal bases that yield the identity matrix for the integral of the product of any two bases. We construct orthonormalized Gaussian basis functions using Cholesky decomposition and derive a property of Cholesky decomposition with respect to Gram-Schmidt orthonormalization. The advantages of functional clustering are that it can be applied to data observed at different time points for each subject, and that the functional structure behind the data can be captured by removing measurement errors. Numerical experiments are conducted to investigate the effectiveness of the proposed method, as compared to conventional discrete cluster analysis. The proposed method is applied to three-dimensional (3D) protein structural data that determine the 3D arrangement of amino acids in individual proteins.
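The following sketch shows the basic construction suggested by the abstract: build Gaussian basis functions, approximate their Gram matrix of cross-product integrals, and use its Cholesky factor to transform the basis so that the Gram matrix becomes the identity. The centers, width, grid, and simple Riemann-sum quadrature are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical Gaussian basis on [0, 1]: 8 centers and a common width.
t = np.linspace(0.0, 1.0, 1001)
dt = t[1] - t[0]
centers = np.linspace(0.1, 0.9, 8)
width = 0.08
Phi = np.exp(-(t[:, None] - centers[None, :]) ** 2 / (2 * width ** 2))  # (grid, basis)

# Gram matrix W_ij ~ integral phi_i(t) phi_j(t) dt (Riemann-sum approximation).
W = Phi.T @ Phi * dt

# Cholesky W = L L^T; the transformed basis Psi = Phi L^{-T} is orthonormal,
# since its Gram matrix becomes L^{-1} W L^{-T} = I.
L = np.linalg.cholesky(W)
Psi = np.linalg.solve(L, Phi.T).T

W_ortho = Psi.T @ Psi * dt
print(np.allclose(W_ortho, np.eye(len(centers))))   # True
```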

10.
Data in many different fields come to practitioners through a process naturally described as functional. We propose a classification procedure for oxidation curves. Our algorithm is based on two stages: fitting the functional data by linear splines with free knots, and classifying the estimated knots, which estimate useful oxidation parameters. A real data set of 57 oxidation curves is used to illustrate our approach.
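As a toy version of the first stage, the sketch below fits a linear spline with a single free knot by profiling the least-squares fit over candidate knot locations; the procedure in the paper allows several free knots and then classifies the estimated knots. The synthetic curve and noise level are invented.

```python
import numpy as np

def fit_linear_spline_one_free_knot(t, y):
    """Fit y ~ a + b*t + c*(t - k)_+ by least squares, profiling over the knot k:
    for each candidate knot the model is linear, so we scan candidates and keep
    the knot with the smallest residual sum of squares."""
    best = None
    for k in t[2:-2]:                          # keep a few points on each side
        X = np.column_stack([np.ones_like(t), t, np.maximum(t - k, 0.0)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = np.sum((y - X @ coef) ** 2)
        if best is None or rss < best[0]:
            best = (rss, k, coef)
    return best                                # (rss, knot, coefficients)

# Synthetic "oxidation-like" curve with a change of slope at t = 0.4.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 60)
y = 0.2 + 1.5 * t + 3.0 * np.maximum(t - 0.4, 0) + rng.normal(0, 0.05, t.size)
rss, knot, coef = fit_linear_spline_one_free_knot(t, y)
print(round(knot, 3), coef.round(2))
```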

11.
This paper introduces a novel mixture model-based approach to the simultaneous clustering and optimal segmentation of functional data, which are curves presenting regime changes. The proposed model consists of a finite mixture of piecewise polynomial regression models. Each piecewise polynomial regression model is associated with a cluster, and within each cluster, each piecewise polynomial component is associated with a regime (i.e., a segment). We derive two approaches to learning the model parameters: the first is an estimation approach which maximizes the observed-data likelihood via a dedicated expectation-maximization (EM) algorithm, then yielding a fuzzy partition of the curves into K clusters obtained at convergence by maximizing the posterior cluster probabilities. The second is a classification approach and optimizes a specific classification likelihood criterion through a dedicated classification expectation-maximization (CEM) algorithm. The optimal curve segmentation is performed by using dynamic programming. In the classification approach, both the curve clustering and the optimal segmentation are performed simultaneously as the CEM learning proceeds. We show that the classification approach is a probabilistic version generalizing the deterministic K-means-like algorithm proposed in Hébrail, Hugueney, Lechevallier, and Rossi (2010). The proposed approach is evaluated using simulated curves and real-world curves. Comparisons with alternatives including regression mixture models and the K-means-like algorithm for piecewise regression demonstrate the effectiveness of the proposed approach.  相似文献   
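The dynamic-programming segmentation step can be illustrated on a single curve: precompute the polynomial-fit cost of every candidate segment, then find the partition into a fixed number of contiguous segments with minimal total cost. The sketch covers only that step, on a hypothetical piecewise curve; the mixture, EM, and CEM machinery of the paper is not shown.

```python
import numpy as np

def segment_cost(t, y, degree=1):
    """cost[i, j] = SSE of a degree-`degree` polynomial fitted to y[i:j]."""
    n = len(t)
    cost = np.full((n + 1, n + 1), np.inf)
    for i in range(n):
        for j in range(i + degree + 2, n + 1):   # need enough points to fit
            coef = np.polyfit(t[i:j], y[i:j], degree)
            resid = y[i:j] - np.polyval(coef, t[i:j])
            cost[i, j] = np.sum(resid ** 2)
    return cost

def optimal_segmentation(t, y, n_segments, degree=1):
    """Dynamic programming: best split of the curve into `n_segments` contiguous
    segments, each fitted by its own polynomial, minimizing the total SSE."""
    n = len(t)
    cost = segment_cost(t, y, degree)
    best = np.full((n_segments + 1, n + 1), np.inf)
    arg = np.zeros((n_segments + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for r in range(1, n_segments + 1):
        for j in range(1, n + 1):
            cand = best[r - 1, :j] + cost[:j, j]
            arg[r, j] = int(np.argmin(cand))
            best[r, j] = cand[arg[r, j]]
    # Backtrack the segment start indices.
    cuts, j = [], n
    for r in range(n_segments, 0, -1):
        cuts.append(arg[r, j]); j = arg[r, j]
    return best[n_segments, n], sorted(cuts)[1:]   # total SSE, interior change points

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 120)
y = np.where(t < 0.3, 1.0, np.where(t < 0.7, 2.5 - 2 * t, 0.5 + t)) + rng.normal(0, 0.05, t.size)
sse, changes = optimal_segmentation(t, y, n_segments=3)
print(sse.round(3), [round(t[c], 2) for c in changes])   # change points near 0.3 and 0.7
```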

12.
Normal mixture models are widely used for statistical modeling of data, including cluster analysis. However, maximum likelihood estimation (MLE) for normal mixtures using the EM algorithm may fail as a result of singularities or degeneracies. To avoid this, we propose replacing the MLE by a maximum a posteriori (MAP) estimator, also found by the EM algorithm. For choosing the number of components and the model parameterization, we propose a modified version of BIC, where the likelihood is evaluated at the MAP instead of the MLE. We use a highly dispersed proper conjugate prior, containing a small fraction of one observation's worth of information. The resulting method avoids degeneracies and singularities, but when these are not present it gives results similar to those of the standard method using MLE, EM, and BIC.
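To show why a MAP M-step prevents the variance collapse mentioned above, here is a univariate EM sketch in which the variance update can optionally be the posterior mode under an inverse-gamma(a, b) prior. This is a simplified stand-in for the highly dispersed conjugate prior of the paper, which also covers the means and mixing proportions; the data and prior values are invented.

```python
import numpy as np

def em_gaussian_mixture_1d(x, K=2, n_iter=200, prior=None, seed=0):
    """EM for a univariate K-component normal mixture. If `prior` is given as
    (a, b), the variance M-step returns the posterior mode under an
    inverse-gamma(a, b) prior instead of the MLE, which keeps variances away
    from zero and so avoids the usual degeneracies."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, K, replace=False)
    sigma2 = np.full(K, x.var())
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: responsibilities.
        dens = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step.
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        ss = (r * (x[:, None] - mu) ** 2).sum(axis=0)
        if prior is None:
            sigma2 = ss / nk                               # MLE update
        else:
            a, b = prior
            sigma2 = (2 * b + ss) / (2 * a + nk + 2)       # MAP update (posterior mode)
    return pi, mu, sigma2

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(4, 0.5, 100)])
print(em_gaussian_mixture_1d(x, prior=(1.0, 0.5)))
```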

13.
The Deterministic Input Noisy Output “AND” gate (DINA) model and the Deterministic Input Noisy Output “OR” gate (DINO) model are two popular cognitive diagnosis models (CDMs) for educational assessment. They represent different views on how the mastery of cognitive skills and the probability of a correct item response are related. Recently, however, Liu, Xu, and Ying demonstrated that the DINO model and the DINA model share a “dual” relation. This means that one model can be expressed in terms of the other, and which of the two models is fitted to a given data set is essentially irrelevant because the results are identical. In this article, a proof of the duality of the DINA model and the DINO model is presented that is tailored to the form and parameterization of general CDMs that have become the new theoretical standard in cognitively diagnostic modeling.  相似文献   
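The duality can be checked numerically for a single item: the DINO response probability at a skill profile equals one minus the DINA probability at the complementary profile, with the guessing and slipping parameters swapped. The Q-vector and parameter values below are arbitrary illustrations; the published proof is of course more general.

```python
import numpy as np
from itertools import product

def dina_p(alpha, q, g, s):
    """DINA: P(X=1 | alpha) = (1-s)^eta * g^(1-eta), eta = 1 iff all required skills are mastered."""
    eta = float(all(a >= qk for a, qk in zip(alpha, q)))
    return (1 - s) ** eta * g ** (1 - eta)

def dino_p(alpha, q, g, s):
    """DINO: P(X=1 | alpha) = (1-s)^omega * g^(1-omega), omega = 1 iff at least one required skill is mastered."""
    omega = float(any(a and qk for a, qk in zip(alpha, q)))
    return (1 - s) ** omega * g ** (1 - omega)

# Duality check for one hypothetical item: DINO at alpha equals 1 minus DINA at
# the complementary profile 1 - alpha, with guessing and slipping swapped.
q, g, s = (1, 0, 1), 0.2, 0.1
for alpha in product((0, 1), repeat=3):
    alpha_c = tuple(1 - a for a in alpha)
    assert np.isclose(dino_p(alpha, q, g, s), 1 - dina_p(alpha_c, q, s, g))
print("duality verified for all skill profiles")
```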

14.
Mechanistic models in molecular systems biology are generally mathematical models of the action of networks of biochemical reactions, involving metabolism, signal transduction, and/or gene expression. They can be either simulated numerically or analyzed analytically. Systems biology integrates quantitative molecular data acquisition with mathematical models to design new experiments, discriminate between alternative mechanisms and explain the molecular basis of cellular properties. At the heart of this approach are mechanistic models of molecular networks. We focus on the articulation and development of mechanistic models, identifying five constraints which guide the articulation of models in molecular systems biology. These constraints are not independent of one another, with the result that modeling becomes an iterative process. We illustrate the use of these constraints in the modeling of the mechanism for bistability in the lac operon.  相似文献   

15.
By questioning the cognitive thesis of model-based reasoning, this paper attempts a normative study of the cognitive-historical method of analysis within the program of naturalized epistemology. Analogical modeling is one of the principal forms of model-based reasoning, and its basic mechanism comprises two parts: generalizing abstraction from the model's source, and constraining or revising that source according to features of the target domain. These two steps are applied iteratively until a model suited to the target domain is constructed. The fitness between model and target domain serves as the basic criterion for evaluating the adequacy of this mechanism, and the analysis of similarity and difference provided by type-hierarchy theory offers a workable method for measuring fitness. Combining type-hierarchy theory with Bayesian methods explains how analogical modeling can raise a model's credibility, as well as the relationship between the creativity of analogy and scientific rationality. This work offers a possible rebuttal to the cognitive thesis of model-based scientific reasoning.

16.
This paper first compares Nagel's classical model of reduction with Kim's functional-reduction model and points out that Nagel's model is formulated at the epistemological or methodological level, whereas Kim's is formulated at the practical or ontological level; the two models are therefore not opposed but complementary. It then examines the internal structure of individual events in order to analyze and evaluate Kim's theory of local reduction, identifying its difficulties and a way out of them. On the basis of Kim's ordered-triple structure of events, an ordered-quadruple structure is proposed, which allows a distinction between functional meaning and functional structure and further reveals the supervenience relation between a functional whole and its realizers. Finally, the paper emphasizes the ontological stance of functional realism and, with the help of the two a priori categories of substance-accident and cause-effect, gives a metaphysical account and defense of functional realism and the notion of supervenience.

17.
The primary method for validating cluster analysis techniques is through Monte Carlo simulations that rely on generating data with known cluster structure (e.g., Milligan 1996). This paper defines two kinds of data generation mechanisms with cluster overlap, marginal and joint; current cluster generation methods are framed within these definitions. An algorithm generating overlapping clusters based on shared densities from several different multivariate distributions is proposed and shown to lead to an easily understandable notion of cluster overlap. Besides outlining the advantages of generating clusters within this framework, a discussion is given of how the proposed data generation technique can be used to augment research into current classification techniques such as finite mixture modeling, classification algorithm robustness, and latent profile analysis.
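Below is a minimal sketch of generating two overlapping Gaussian clusters and summarizing their overlap. Here overlap is reported as the estimated Bayes misclassification rate of the corresponding two-component mixture, which is one simple joint-style measure and not necessarily the paper's definition; the means and covariance are invented.

```python
import numpy as np

def generate_overlapping_clusters(n_per_cluster, means, cov, seed=0):
    """Draw two multivariate normal clusters and report a simple joint measure of
    overlap: the share of points whose own component is not the most likely one
    under the (equal-weight) mixture, i.e. an estimate of the Bayes error."""
    rng = np.random.default_rng(seed)
    X, labels = [], []
    for k, m in enumerate(means):
        X.append(rng.multivariate_normal(m, cov, n_per_cluster))
        labels.append(np.full(n_per_cluster, k))
    X, labels = np.vstack(X), np.concatenate(labels)

    inv, logdet = np.linalg.inv(cov), np.linalg.slogdet(cov)[1]
    def log_dens(m):
        d = X - m
        return -0.5 * (np.einsum("ij,jk,ik->i", d, inv, d) + logdet)
    ll = np.column_stack([log_dens(np.asarray(m)) for m in means])
    overlap = np.mean(ll.argmax(axis=1) != labels)
    return X, labels, overlap

means = [[0.0, 0.0], [2.0, 0.0]]            # closer means -> more overlap
X, labels, overlap = generate_overlapping_clusters(300, means, np.eye(2))
print(round(overlap, 3))
```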

18.
This paper studies the problem of estimating the number of clusters in the context of logistic regression clustering. The classification likelihood approach is employed to tackle this problem. A model-selection-based criterion for selecting the number of logistic curves is proposed and its asymptotic property is also considered. The small-sample performance of the proposed criterion is studied by Monte Carlo simulation. In addition, a real data example is presented. The authors would like to thank the editor, Prof. Willem J. Heiser, and the anonymous referees for the valuable comments and suggestions, which have led to the improvement of this paper.

19.
Against the multiple-realization thesis, Shapiro constructs a dilemma: in a purported case of multiple realization, if the two physical realizers are of the same type, then the functional type is in fact singly realized and should be reduced to a physical type; if the two physical realizers are genuinely different types, then no empirical law can group the two physical types under a single functional type, and the functional type should be eliminated. Whether by reduction or elimination, the autonomy of the special sciences that study functional types...

20.
Cluster-weighted modeling (CWM) is a mixture approach to modeling the joint probability of data coming from a heterogeneous population. Under Gaussian assumptions, we investigate statistical properties of CWM from both theoretical and numerical point of view; in particular, we show that Gaussian CWM includes mixtures of distributions and mixtures of regressions as special cases. Further, we introduce CWM based on Student-t distributions, which provides a more robust fit for groups of observations with longer than normal tails or noise data. Theoretical results are illustrated using some empirical studies, considering both simulated and real data. Some generalizations of such models are also outlined.  相似文献   
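The joint density that Gaussian CWM places on (x, y) can be written down directly: a mixture over clusters of a Gaussian for the covariates times a Gaussian linear regression for the response given the covariates. The sketch below evaluates that density for made-up parameters; the Student-t variant and the fitting procedure are not shown. When the covariate distribution is the same across clusters, the model reduces to a mixture of regressions, in line with the special cases noted in the abstract.

```python
import numpy as np

def gaussian_cwm_density(x, y, weights, mu, Sigma, beta, sigma2):
    """Joint density of a Gaussian cluster-weighted model:
    p(x, y) = sum_k w_k * N(x; mu_k, Sigma_k) * N(y; beta_k0 + beta_k1' x, sigma2_k)."""
    total = 0.0
    for w, m, S, b, s2 in zip(weights, mu, Sigma, beta, sigma2):
        d = x - m
        px = np.exp(-0.5 * d @ np.linalg.solve(S, d)) / np.sqrt(
            (2 * np.pi) ** len(x) * np.linalg.det(S))
        mean_y = b[0] + b[1:] @ x
        py = np.exp(-0.5 * (y - mean_y) ** 2 / s2) / np.sqrt(2 * np.pi * s2)
        total += w * px * py
    return total

# Two hypothetical clusters in one covariate: different x-locations and regression lines.
weights = [0.6, 0.4]
mu      = [np.array([0.0]), np.array([3.0])]
Sigma   = [np.eye(1), np.eye(1) * 0.5]
beta    = [np.array([1.0, 2.0]), np.array([-1.0, 0.5])]   # intercept, slope
sigma2  = [0.25, 0.25]

print(gaussian_cwm_density(np.array([0.2]), 1.5, weights, mu, Sigma, beta, sigma2))
```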
