Found 20 similar articles (search time: 31 ms)
A posteriori blockmodeling for graphs is proposed. The model assumes that the vertices of the graph are partitioned into two unknown blocks and that the probability of an edge between two vertices depends only on the blocks to which they belong. Statistical procedures are derived for estimating the probabilities of edges and for predicting the block structure from observations of the edge pattern only. ML estimators can be computed using the EM algorithm, but this strategy is practical only for small graphs. A Bayesian estimator, based on Gibbs sampling, is proposed. This estimator is also practical for large graphs. When ML estimators are used, the block structure can be predicted based on predictive likelihood. When Gibbs sampling is used, the block structure can be predicted from posterior predictive probabilities. A side result is that when the number of vertices tends to infinity while the probabilities remain constant, the block structure can be recovered correctly with probability tending to 1.
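
The generative assumption in the abstract above (edge probability depends only on the blocks of the two endpoints) can be illustrated with a minimal sketch. The function name `sample_two_block_graph`, the 2x2 table `probs`, and the fixed seed are assumptions made for illustration; the sketch only samples from such a model and does not implement the EM or Gibbs estimators the abstract mentions.

```python
import random


def sample_two_block_graph(blocks, probs, seed=0):
    """Sample an undirected graph in which the probability of an edge
    between vertices i and j is probs[blocks[i]][blocks[j]].

    blocks: list of block labels (0 or 1), one per vertex (assumed known here).
    probs:  symmetric 2x2 table of edge probabilities (hypothetical values).
    """
    rng = random.Random(seed)
    n = len(blocks)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The edge probability depends only on the two block labels.
            if rng.random() < probs[blocks[i]][blocks[j]]:
                adj[i][j] = adj[j][i] = 1
    return adj


# Hypothetical example: dense within blocks, sparse between blocks.
blocks = [0, 0, 0, 1, 1, 1]
probs = [[0.9, 0.1],
         [0.1, 0.8]]
adj = sample_two_block_graph(blocks, probs)
```

In the setting of the abstract the labels in `blocks` are unknown and must be predicted from `adj` alone; the sketch fixes them only to show what data the model produces.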

In educational measurement, cognitive diagnosis models have been developed to allow assessment of specific skills that are needed to perform tasks. Skill knowledge is characterized as present or absent and represented by a vector of binary indicators, or the skill set profile. After determining which skills are needed for each assessment item, a model is specified for the relationship between item responses and skill set profiles. Cognitive diagnosis models are often used for diagnosis, that is, for classifying students into the different skill set profiles. Generally, cognitive diagnosis models do not exploit student covariate information. However, investigating the effects of student covariates, such as gender, SES, or educational interventions, on skill knowledge mastery is important in education research, and covariate information may improve classification of students to skill set profiles. We extend a common cognitive diagnosis model, the DINA model, by modeling the relationship between the latent skill knowledge indicators and covariates. The probability of skill mastery is modeled as a logistic regression model, possibly with a student-level random intercept, giving a higher-order DINA model with a latent regression. Simulations show that parameter recovery is good for these models and that inclusion of covariates can improve skill diagnosis. When applying our methods to data from an online tutor, we obtain reasonable and interpretable parameter estimates that allow more detailed characterization of groups of students who differ in their predicted skill set profiles.

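This paper presents a fairly comprehensive study of the astronomical and calendrical calculations in Seki Takakazu's (关孝和) Miscellaneous Writings on Astronomy and Mathematics (《天文数学杂著》). It points out in particular that the "correction method" (改正术) in the "Illustrated correction of the Shoushi li (《授时历》) procedure for finding the true conjunction, true accumulation, and true position of the five planets" is largely the same as the account in the Shoushi li jing (《授时历经》); that the "distinctive algorithm" in "Actual Measurement of the Sun's Shadow" (日景实测) is closely connected with the wasan methods for computing π and arc length; and that the Yuan shi (《元史》) already discusses the two inequality corrections, yingsuo (盈缩) and chiji (迟疾), with an account that covers the correction method in Seki's algorithms for the day of true node crossing and the degree of crossing. The diagrams are the essence of Seki's calendrical work. Through his own observations and verification, and from his distinctive perspective, Seki integrated the mathematical astronomy methods of ancient China, further refined the calendrical computations of the Shoushi li, in some respects surpassed them, and opened up new fields of research in Japan. Seki's astronomical and calendrical work represents the highest level of traditional Japanese calendrical science.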

Suppose that we rank-order the conditional probabilities for a group of subjects that are provided from a Bayesian network (BN) model of binary variables. The conditional probability is the probability that a subject has a certain attribute given an outcome of some other variables and the classification is based on the rank-order. Under the condition that the class sizes are equal across the class levels and that all the variables in the model are positively associated with each other, we compared the classification results between models of binary variables which share the same model structure. In the comparison, we used a BN model, called a similar BN model, which was constructed under some rule based on a set of BN models satisfying certain conditions. Simulation results indicate that classification agreement between a set of BN models and their corresponding similar BN model is high: exact agreement for about half of the subjects or more, and agreement to within one class level for about 90% or more.

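Popper holds that the growth of scientific knowledge, that is, of theoretical content, is the most important mark of scientific progress. However, the richness of a theory's content varies inversely with its logical probability. The aim of science is therefore not to pursue theories of high probability but theories of low probability, and not verifiability but falsifiability. Since inductive inference is inference that establishes the truth or probability of its conclusion, it runs counter to the aim of science and should be eliminated from scientific methodology; accordingly, Hume's problem of induction is dissolved along with it. Yet once Popper introduces the concept of "verisimilitude", his method of corroboration can no longer be purely deductive and contains, to a greater or lesser degree, inductive elements. Popper's dismissal of the problem of induction is therefore unsuccessful.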

Syntactic and structural models specify relationships between their constituents but cannot show what outcomes their interaction would produce over time in the world. Simulation consists in iterating the states of a model, so as to produce behaviour over a period of simulated time. Iteration enables us to trace the implications and outcomes of inference rules and other assumptions implemented in the models that make up a theory. We apply this method to experiments which we treat as models of the particular aspects of reality they are designed to investigate. Scientific experiments are constantly designed and re-designed in the context of implementation and use. They mediate between theoretical understanding and the practicalities of engaging with the empirical and social world. In order to model experiments we need to identify and represent features that all experiments have in common. We treat these features as parameters of a general model of experiment so that by varying these parameters different types of experiment can be modelled.
D. C. Gooding

Recognizing the successes of treed Gaussian process (TGP) models as an interpretable and thrifty model for nonparametric regression, we seek to extend the model to classification. Both treed models and Gaussian processes (GPs) have, separately, enjoyed great success in application to classification problems. An example of the former is Bayesian CART. In the latter, real-valued GP output may be utilized for classification via latent variables, which provide classification rules by means of a softmax function. We formulate a Bayesian model averaging scheme to combine these two models and describe a Monte Carlo method for sampling from the full posterior distribution with joint proposals for the tree topology and the GP parameters corresponding to latent variables at the leaves. We concentrate on efficient sampling of the latent variables, which is important to obtain good mixing in the expanded parameter space. The tree structure is particularly helpful for this task and also for developing an efficient scheme for handling categorical predictors, which commonly arise in classification problems. Our proposed classification TGP (CTGP) methodology is illustrated on a collection of synthetic and real data sets. We assess performance relative to existing methods and thereby show how CTGP is highly flexible, offers tractable inference, produces rules that are easy to interpret, and performs well out of sample.

The Deterministic Input Noisy Output “AND” gate (DINA) model and the Deterministic Input Noisy Output “OR” gate (DINO) model are two popular cognitive diagnosis models (CDMs) for educational assessment. They represent different views on how the mastery of cognitive skills and the probability of a correct item response are related. Recently, however, Liu, Xu, and Ying demonstrated that the DINO model and the DINA model share a “dual” relation. This means that one model can be expressed in terms of the other, and which of the two models is fitted to a given data set is essentially irrelevant because the results are identical. In this article, a proof of the duality of the DINA model and the DINO model is presented that is tailored to the form and parameterization of general CDMs that have become the new theoretical standard in cognitively diagnostic modeling.
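
The deterministic core of the "dual" relation described above can be checked by brute force. The sketch below assumes the usual textbook definitions of the "AND" and "OR" gates over a skill profile `alpha` and an item's skill requirements `q`; the function names are hypothetical, and the noisy part of the models (slip and guess parameters) is not covered, so this is an illustration rather than the proof presented in the article.

```python
from itertools import product


def dina_ideal(alpha, q):
    """'AND' gate: 1 only if every skill the item requires is mastered."""
    return int(all(a == 1 for a, need in zip(alpha, q) if need))


def dino_ideal(alpha, q):
    """'OR' gate: 1 if at least one skill the item requires is mastered."""
    return int(any(a == 1 for a, need in zip(alpha, q) if need))


def check_duality(num_skills):
    """Check, for every q-vector and skill profile, that the DINO ideal
    response equals 1 minus the DINA ideal response of the flipped profile."""
    for q in product([0, 1], repeat=num_skills):
        for alpha in product([0, 1], repeat=num_skills):
            flipped = tuple(1 - a for a in alpha)
            if dino_ideal(alpha, q) != 1 - dina_ideal(flipped, q):
                return False
    return True


duality_holds = check_duality(3)
```

The identity being checked is De Morgan's law: "at least one required skill is mastered" is the negation of "every required skill is unmastered".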

This paper studies the random indexed dendrograms produced by agglomerative hierarchical algorithms under the non-classifiability hypothesis of independent identically distributed (i.i.d.) dissimilarities. New tests for classifiability are deduced. The corresponding test statistics are random variables attached to the indexed dendrograms, such as the indices, the survival time of singletons, the value of the ultrametric between two given points, or the size of classes in the different levels of the dendrogram. For an indexed dendrogram produced by the Single Link method on i.i.d. dissimilarities, the distribution of these random variables is computed, thus leading to explicit tests. For the case of the Average and Complete Link methods, some asymptotic results are presented. The proofs rely essentially on the theory of random graphs.
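
As a rough illustration of why random graphs enter: the Single Link classes at level t are the connected components of the graph that joins every pair with dissimilarity at most t, so the merge levels (the dendrogram indices) can be read off a Kruskal-style pass over the sorted dissimilarities. The sketch below draws i.i.d. uniform dissimilarities and computes those levels; the function names and the uniform distribution are assumptions, and the paper's tests themselves are not reproduced.

```python
import random


def single_link_indices(d):
    """Return the n-1 Single Link merge levels for a symmetric
    dissimilarity matrix d, in increasing order."""
    n = len(d)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = sorted((d[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    levels = []
    for value, i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two classes merge at the smallest dissimilarity linking them.
            parent[ri] = rj
            levels.append(value)
    return levels


def iid_dissimilarities(n, seed=0):
    """Symmetric matrix of i.i.d. uniform(0, 1) dissimilarities (assumed law)."""
    rng = random.Random(seed)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = rng.random()
    return d


levels = single_link_indices(iid_dissimilarities(8))
```

Repeating the last line over many seeds would give an empirical distribution of the indices under the non-classifiability hypothesis, against which an observed dendrogram could be compared.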

We argue from the Church-Turing thesis (Kleene, Mathematical Logic. New York: Wiley, 1967) that a program can be considered as equivalent to a formal language similar to predicate calculus where predicates can be taken as functions. We can relate such a calculus to Wittgenstein’s first major work, the Tractatus, and use the Tractatus and its theses as a model of the formal classical definition of a computer program. However, Wittgenstein found flaws in his initial great work and he explored these flaws in a new thesis described in his second great work, the Philosophical Investigations. The question we address is “can computer science make the same leap?” We are proposing, because of the flaws identified by Wittgenstein, that computers will never have the possibility of natural communication with people unless they become active participants of human society. The essential difference between formal models used in computing and human communication is that formal models are based upon rational sets whereas people are not so restricted. We introduce irrational sets as a concept that requires the use of an abductive inference system. However, formal models are still considered central to our means of using hypotheses through deduction to make predictions about the world. These formal models are required to continually be updated in response to changes in people’s way of seeing the world. We propose that one mechanism used to keep track of these changes is the Peircean abductive loop.

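Assuming that Hempel's covering-law models of explanation, namely the deductive-nomological (D-N) model and the inductive-statistical (I-S) model, apply to the social sciences, this paper examines the applicability of the I-S model to the social sciences through a discussion of the relevance problem and the high-probability requirement.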

Multiple choice items on tests and Likert items on surveys are ubiquitous in educational, social and behavioral science research; however, methods for analyzing such data can be problematic. Multidimensional item response theory models are proposed that yield structured Poisson regression models for the joint distribution of responses to items. The methodology presented here extends the approach described in Anderson, Verkuilen, and Peyton (2010) that used fully conditionally specified multinomial logistic regression models as item response functions. In this paper, covariates are added as predictors of the latent variables along with covariates as predictors of location parameters. Furthermore, the models presented here incorporate ordinal information of the response options, thus allowing an empirical examination of assumptions regarding the ordering and the estimation of optimal scoring of the response options. To illustrate the methodology and flexibility of the models, data from a study on aggression in middle school (Espelage, Holt, and Henkel 2004) are analyzed. The models are fit to data using SAS.

This study aims to understand scientific inference for the evolutionary procedure of Continental Drift based on abductive inference, which is important for creative inference and scientific discovery during problem solving. We present the following two research problems: (1) we suggest a scientific inference procedure as well as various strategies and a criterion for choosing hypotheses over other competing or previous hypotheses; aspects of this procedure include puzzling observation, abduction, retroduction, updating, deduction, induction, and recycle; and (2) we analyze the “theory of continental drift” discovery, called the Earth science revolution, using our multistage inference procedure. Wegener’s Continental Drift hypothesis had an impact comparable to the revolution caused by Darwin’s theory of evolution in biology. Finally, the suggested inquiry inference model can provide us with a more consistent view of science and promote a deeper understanding of scientific concepts.

Dimensionally reduced model-based clustering methods have recently received wide interest in statistics as a tool for simultaneously performing clustering and dimension reduction through one or more latent variables. Among these, Mixtures of Factor Analyzers assume that, within each component, the data are generated according to a factor model, thus reducing the number of parameters on which the covariance matrices depend. In Factor Mixture Analysis, clustering is performed through the factors of an ordinary factor analysis, which are jointly modelled by a Gaussian mixture. The two approaches differ in genesis, parameterization and consequently clustering performance. In this work we propose a model which extends and combines them. The proposed Mixtures of Factor Mixture Analyzers provide a unified class of dimensionally reduced mixture models which includes the previous ones as special cases and could offer a powerful tool for modelling non-Gaussian latent variables.

This report extends earlier work by Brailovsky on regression theory and methodology, giving particular emphasis to function approximation for incompletely specified models. The interest here is with situations where the form of the regression relation is not known in advance. We discuss several difficulties that arise in using local approximation and linear regression methods, and propose ways to overcome these problems. To aid the data analyst in developing a suitable model, an illustrative table is derived for determining the number of initial explanatory functions justifiable for a given prespecified confidence level. The general approach formulated here is illustrated with an application to medical data. Relevance to classification and possible extensions are discussed.

How can new drug lead suggestions be inferred from neurophysiological models? This paper addresses this question based on a case study of research into Parkinson's disease at the Groningen University Department of Pharmacy. It is argued that neurophysiological box-and-arrow models can be understood as qualitative differential equation models. An inference task is defined to help understand and possibly aid the discovery and explanation of new drug lead suggestions.

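The essence and significance of "qu xiang bi lei" (取象比类) in traditional Chinese science   Total citations: 7, self-citations: 1, citations by others: 6
The "qu xiang bi lei" (取象比类, taking images and comparing categories) method of traditional Chinese science has long been regarded by many as analogical reasoning and hence as merely probable in its conclusions. This paper argues that "qu xiang bi lei" is in fact a special process of abstract thinking: it fully reflects the interpenetration of perceptual and rational components that characterizes the categories of traditional Chinese science, and it plays a key role in intuitive experiential activity.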

We discuss a generalization of the standard notion of probability space and show that the emerging framework, to be called operational probability theory, can be considered as underlying quantal theories. The proposed framework makes special reference to the convex structure of states and to a family of observables which is wider than the familiar set of random variables: it appears as an alternative to the known algebraic approach to quantum probability.

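On possible approaches to integrating mental logic theory and mental model theory   Total citations: 3, self-citations: 0, citations by others: 3
The literature shows that, to date, there is no comprehensive, precise, and unified psychological theory of human reasoning; mental logic theory and mental model theory are the two mainstream theories that have developed fastest over the past 20 years. Mental logic theory holds that people reason by applying inference schemas, whereas mental model theory holds that people reason by constructing mental models. This paper discusses the scope of application and the limitations of each theory and proposes possible directions for their future integration.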

Based on the notion of mutual information between the components of a random vector, we construct, for data reduction reasons, an optimal quantization of the support of its probability measure. More precisely, we propose a simultaneous discretization of the whole set of the components of the random vector which takes into account, as much as possible, the stochastic dependence between them. Examples are presented.
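
The quantity the abstract builds on, the mutual information between two components, can be computed directly from a discrete joint distribution. The sketch below assumes a two-component vector whose joint probabilities are given as a table; the function name and example tables are hypothetical, and the optimal quantization procedure itself is not reproduced.

```python
from math import log


def mutual_information(joint):
    """Mutual information (in nats) of a discrete joint distribution
    given as a 2-D table of probabilities that sum to 1."""
    row = [sum(r) for r in joint]          # marginal of the first component
    col = [sum(c) for c in zip(*joint)]    # marginal of the second component
    mi = 0.0
    for i, r in enumerate(joint):
        for j, p in enumerate(r):
            if p > 0:  # cells with zero probability contribute nothing
                mi += p * log(p / (row[i] * col[j]))
    return mi


# Hypothetical tables: independent components vs. perfectly dependent ones.
independent = [[0.25, 0.25], [0.25, 0.25]]
dependent = [[0.5, 0.0], [0.0, 0.5]]
mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

A discretization that merges cells can only keep or lower this value, which is one way to read the abstract's goal of preserving the stochastic dependence "as much as possible".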
