Found 20 similar documents; search took 31 ms
1.
Identifiability of Models for Clusterwise Linear Regression (Total citations: 3, self-citations: 1, citations by others: 2)
C. Hennig 《Journal of Classification》2000,17(2):273-296
The model choice and the interpretation of the parameters are discussed as
well as the use of the identifiability concept for fixed partition models.
The concept is generalized to "partial identifiability".
2.
3.
This paper studies the problem of estimating the number of clusters in the context of logistic regression clustering. The classification likelihood approach is employed to tackle this problem. A model-selection based criterion for selecting the number of logistic curves is proposed and its asymptotic property is also considered. The small sample performance of the proposed criterion is studied by Monte Carlo simulation. In addition, a real data example is presented. The authors would like to thank the editor, Prof. Willem J. Heiser, and the anonymous referees for the valuable comments and suggestions, which have led to the improvement of this paper.
4.
5.
6.
This paper proposes a maximum clustering similarity (MCS) method for determining the number of clusters in a data set by studying
the behavior of similarity indices comparing two (of several) clustering methods. The similarity between the two clusterings
is calculated at the same number of clusters, using the indices of Rand (R), Fowlkes and Mallows (FM), and Kulczynski (K)
each corrected for chance agreement. The number of clusters at which the index attains its maximum is a candidate for the
optimal number of clusters. The proposed method is applied to simulated bivariate normal data, and further extended for use
in circular data. Its performance is compared to the criteria discussed in Tibshirani, Walther, and Hastie (2001). The proposed
method is not based on any distributional or data assumption which makes it widely applicable to any type of data that can
be clustered using at least two clustering algorithms.
7.
The primary method for validating cluster analysis techniques is through Monte
Carlo simulations that rely on generating data with known cluster structure (e.g., Milligan
1996). This paper defines two kinds of data generation mechanisms with cluster overlap,
marginal and joint; current cluster generation methods are framed within these definitions.
An algorithm generating overlapping clusters based on shared densities from several different
multivariate distributions is proposed and shown to lead to an easily understandable
notion of cluster overlap. Besides outlining the advantages of generating clusters within
this framework, a discussion is given of how the proposed data generation technique can
be used to augment research into current classification techniques such as finite mixture
modeling, classification algorithm robustness, and latent profile analysis.
8.
9.
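Research on Large-Scale Mineralization and Prediction of Large Ore-Cluster Districts (Total citations: 3, self-citations: 0, citations by others: 3)
"Large-Scale Mineralization and Prediction of Large Ore-Cluster Districts" is the first research project targeting solid mineral resources since the National Key Basic Research Development Program (973 Program) was launched. Over five years of research (October 1999 to September 2004), important progress was made in several areas of basic geology and metallogenic theory for mineral resources: a theoretical framework for Meso-Cenozoic continental metallogeny in China was preliminarily proposed, laying a theoretical foundation for predicting large deposits and large ore-cluster districts; four new prospecting techniques and methods were developed, and two new prospecting approaches were proposed; and, at the experimental stage, five prospecting target areas at the ore-cluster-district scale were delineated and a number of mineralization anomaly zones were discovered. In addition, the research produced three national-level and three ministry-level outstanding research groups, and trained nine outstanding young and middle-aged researchers along with many postdoctoral fellows, doctoral students, and master's students; quite a few of these young and middle-aged scientists now hold positions in international academic organizations. During the research period, 772 scientific papers were published, of which 227 were SCI-indexed (125 published abroad).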
10.
Seong Keon Lee 《Journal of Classification》2006,23(1):123-141
In many application fields, multivariate approaches that simultaneously consider the correlation between responses are needed.
The tree method can be extended to multivariate responses, such as repeated measure and longitudinal data, by modifying the
split function so as to accommodate multiple responses. Recently, researchers have constructed some decision trees for multiple
continuous longitudinal response and multiple binary responses using Mahalanobis distance and a generalized entropy index.
However, these methods have limitations according to the type of response, that is, those that are only continuous or binary.
In this paper, we will modify the tree for univariate response procedure and suggest a new tree-based method that can analyze
any type of multiple responses by using GEE (generalized estimating equations) techniques. To compare the performance of trees,
simulation studies on selection probability of true split variable will be shown. Finally, applications using epileptic seizure
data and WWW data are introduced.
11.
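The transformation thesis of naturalized epistemology emerged in the 1970s and 1980s as a strategy responding to the failure of Quine's replacement thesis. This article analyzes in detail the problems with the transformation thesis and argues that it cannot fully accomplish the task of naturalizing epistemology. In light of developments in modern cognitive science, epistemological research must return from description to the normative, and coordinate the two, in order to open up a promising field for the development of epistemology.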
12.
Latent class (LC) analysis is used by social, behavioral, and medical science researchers among others as a tool for clustering (or unsupervised classification) with categorical response variables, for analyzing the agreement between multiple raters, for evaluating the sensitivity and specificity of diagnostic tests in the absence of a gold standard, and for modeling heterogeneity in developmental trajectories. Despite the increased popularity of LC analysis, little is known about statistical power and required sample size in LC modeling. This paper shows how to perform power and sample size computations in LC models using Wald tests for the parameters describing association between the categorical latent variable and the response variables. Moreover, the design factors affecting the statistical power of these Wald tests are studied. More specifically, we show how design factors which are specific for LC analysis, such as the number of classes, the class proportions, and the number of response variables, affect the information matrix. The proposed power computation approach is illustrated using realistic scenarios for the design factors. A simulation study conducted to assess the performance of the proposed power analysis procedure shows that it performs well in all situations one may encounter in practice.
13.
Faicel Chamroukhi 《Journal of Classification》2016,33(3):374-411
This paper introduces a novel mixture model-based approach to the simultaneous clustering and optimal segmentation of functional data, which are curves presenting regime changes. The proposed model consists of a finite mixture of piecewise polynomial regression models. Each piecewise polynomial regression model is associated with a cluster, and within each cluster, each piecewise polynomial component is associated with a regime (i.e., a segment). We derive two approaches to learning the model parameters: the first is an estimation approach which maximizes the observed-data likelihood via a dedicated expectation-maximization (EM) algorithm, then yielding a fuzzy partition of the curves into K clusters obtained at convergence by maximizing the posterior cluster probabilities. The second is a classification approach and optimizes a specific classification likelihood criterion through a dedicated classification expectation-maximization (CEM) algorithm. The optimal curve segmentation is performed by using dynamic programming. In the classification approach, both the curve clustering and the optimal segmentation are performed simultaneously as the CEM learning proceeds. We show that the classification approach is a probabilistic version generalizing the deterministic K-means-like algorithm proposed in Hébrail, Hugueney, Lechevallier, and Rossi (2010). The proposed approach is evaluated using simulated curves and real-world curves. Comparisons with alternatives including regression mixture models and the K-means-like algorithm for piecewise regression demonstrate the effectiveness of the proposed approach.
14.
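Quantum computing is an entirely new form of computation built on quantum mechanics, and is considered the approach most likely to surpass the computational power of existing conventional computing devices. Quantum computing is "counterintuitive": much of what we take to be common sense in everyday life no longer holds in quantum computing; at the same time, phenomena that our ordinary cognition regards as utterly impossible, or even somewhat "idealist", turn out to be strong evidence that quantum computing may far outperform conventional computing. The great progress of quantum computing, and of quantum computers in particular, in recent years suggests that philosophy is an advanced stage of the development of science, and that the development and progress of philosophy can in turn explain science and promote its development. The development of science may have intrinsic limitations, and philosophy may be able to propose solutions to these limitations at a higher level.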
15.
Classifications are generally pictured in the form of hierarchical trees, also called dendrograms. A dendrogram is the graphical
representation of an ultrametric (=cophenetic) matrix; so dendrograms can be compared to one another by comparing their cophenetic
matrices. Three methods used in testing the correlation between matrices corresponding to dendrograms are evaluated. The three
permutational procedures make use of different aspects of the information to compare dendrograms: the Mantel procedure permutes
label positions only; the binary tree methods randomize the topology as well; the double-permutation procedure is based on
all the information included in a dendrogram, that is: topology, label positions, and cluster heights. Theoretical and empirical
investigations of these methods are carried out to evaluate their relative performance. Simulations show that the Mantel test
is too conservative when applied to the comparison of dendrograms; the methods of binary tree comparisons do slightly better;
only the double-permutation test provides unbiased type I error.
Résumé: The trees used to illustrate groupings are generally represented in the form of hierarchical classifications, or dendrograms. A dendrogram graphically represents the information contained in the ultrametric (=cophenetic) matrix
corresponding to the classification. Dendrograms can therefore be compared through the corresponding ultrametric matrices. We compare three methods for evaluating
the statistical significance of the correlation coefficient measured between two ultrametric matrices. These three permutation tests
take different aspects into account when comparing dendrograms: the Mantel test permutes the leaves
of the tree; the binary tree methods permute the leaves and the topology; and the double-permutation procedure
permutes the leaves, the topology, and the fusion levels of the dendrograms being compared. The relative efficiency of the three methods
is evaluated empirically and theoretically. Our results suggest preferring the double-permutation test
for the comparison of dendrograms: the Mantel test proves too conservative, while the binary tree methods
are not always adequate.
This work was supported by NSERC grant no. A7738 to Pierre Legendre and by a NSERC scholarship to F.-J. Lapointe.
16.
17.
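The Relationship Between Theory and Fact: From the Perspective of Hacking's View of Experiment (Total citations: 4, self-citations: 1, citations by others: 4)
The thesis that observation is theory-laden is a common feature of historicist philosophy of science, and is also the most important root of the relativism and irrationalism within historicism. Hacking's view of experiment implies a new relationship between theory and fact. Starting from Hacking's arguments, the author attempts to develop a new theory of observation and reaches the following conclusions: observation is not determined by theories and paradigms but only influenced by them to some degree; facts have an autonomous force of their own; observation and understanding across paradigms are both possible; and historicists have exaggerated the integrity and self-completeness of paradigms.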
18.
19.
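Fodor applies the concept of a computational module to the analysis of the modularity of mind, proposing that the parts of the mind responsible for input analysis (the perceptual systems, the language system, etc.) are modular, whereas the part of the mind devoted to fixing beliefs and to thinking (the central system) is non-modular. From this, Fodor further concludes that the computational theory of mind does not apply to the central system. However, Fodor's conclusion creates a problem for his computationalist defense of intentional realism: if thinking is not computation, then the intentional realism Fodor maintains about commonsense psychology becomes a castle in the air.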
20.
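Ideal and Reality: Classification and Comparison of Views on Sustainable Development (Total citations: 7, self-citations: 0, citations by others: 7)
Compared with development in the traditional sense, sustainable development has profound philosophical underpinnings and rich content. Chinese and foreign scholars have offered many in-depth and even incisive discussions of it. Through classification and comparison, this article argues that sustainable development is undoubtedly a lofty ideal of humankind, one that can be realized step by step through orderly breakthroughs in theoretical research, organic changes in patterns of behavior, and effective execution in strategy implementation. At the same time, the author pays particular attention to the difficulty of putting sustainable development into practice: there is a considerable gap between ideal and reality, and its implementation has a long way to go.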