A clustering that consists of a nested set of clusters may be represented graphically by a tree. In contrast, a clustering that includes non-nested overlapping clusters (sometimes termed a “nonhierarchical” clustering) cannot be represented by a tree. Graphical representations of such non-nested overlapping clusterings are usually complex and difficult to interpret. Carroll and Pruzansky (1975, 1980) suggested representing non-nested clusterings with multiple ultrametric or additive trees. Corter and Tversky (1986) introduced the extended tree (EXTREE) model, which represents a non-nested structure as a tree plus overlapping clusters that are represented by marked segments in the tree. We show here that the problem of finding a nested (i.e., tree-structured) set of clusters in an overlapping clustering can be reformulated as the problem of finding a clique in a graph. Thus, clique-finding algorithms can be used to identify sets of clusters in the solution that can be represented by trees. This formulation provides a means of automatically constructing a multiple tree or extended tree representation of any non-nested clustering. The method, called “clustrees”, is applied to several non-nested overlapping clusterings derived using the MAPCLUS program (Arabie and Carroll 1980).
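The clique reformulation described in this abstract can be illustrated with a small sketch. It assumes that two clusters are "compatible" (can sit in the same tree) when they are disjoint or one contains the other, builds a graph with one vertex per cluster and an edge per compatible pair, and searches for a largest clique with a plain Bron-Kerbosch recursion. The function names and the example clusters are hypothetical, and this is not the authors' clustrees implementation.

```python
from itertools import combinations


def compatible(a, b):
    """Assumed rule: two clusters fit in one tree if disjoint or nested."""
    return a.isdisjoint(b) or a <= b or b <= a


def compatibility_graph(clusters):
    """Adjacency sets: vertex i is cluster i, edges join compatible pairs."""
    adj = {i: set() for i in range(len(clusters))}
    for i, j in combinations(range(len(clusters)), 2):
        if compatible(clusters[i], clusters[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj


def max_clique(adj):
    """Largest clique via Bron-Kerbosch without pivoting (fine for small inputs)."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:  # r is a maximal clique
            if len(r) > len(best):
                best = sorted(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return best


# Hypothetical overlapping clustering: {a,b} and {b,c} overlap without nesting.
clusters = [
    frozenset("ab"),
    frozenset("bc"),
    frozenset("abc"),
    frozenset("de"),
    frozenset("abcde"),
]
tree_clusters = max_clique(compatibility_graph(clusters))
```

Every pair of clusters in the returned clique is nested or disjoint, so the selected clusters can be drawn as a single tree; the clusters left out would need a second tree or marked segments.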

This paper proposes a maximum clustering similarity (MCS) method for determining the number of clusters in a data set by studying the behavior of similarity indices comparing two (of several) clustering methods. The similarity between the two clusterings is calculated at the same number of clusters, using the indices of Rand (R), Fowlkes and Mallows (FM), and Kulczynski (K), each corrected for chance agreement. The number of clusters at which the index attains its maximum is a candidate for the optimal number of clusters. The proposed method is applied to simulated bivariate normal data and further extended for use with circular data. Its performance is compared to the criteria discussed in Tibshirani, Walther, and Hastie (2001). The proposed method is not based on any distributional or data assumption, which makes it applicable to any type of data that can be clustered using at least two clustering algorithms.
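A minimal sketch of the selection step described above, under stated assumptions: only the chance-corrected Rand index is shown (in its common Hubert-Arabie form), the two clusterings are supplied as label lists keyed by the number of clusters, and the names `adjusted_rand` and `mcs_candidate` are hypothetical. It is not the authors' code, and it assumes at least two objects.

```python
from collections import Counter
from math import comb


def adjusted_rand(labels_a, labels_b):
    """Rand index corrected for chance (Hubert-Arabie form)."""
    n = len(labels_a)
    sum_ij = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case; returning 1.0 is a convention, not from the paper
    return (sum_ij - expected) / (max_index - expected)


def mcs_candidate(partitions_a, partitions_b):
    """Return the k at which two methods agree most, plus all scores.

    partitions_a and partitions_b map a number of clusters k to the
    label list that each method produced for that k.
    """
    scores = {
        k: adjusted_rand(partitions_a[k], partitions_b[k])
        for k in partitions_a
        if k in partitions_b
    }
    return max(scores, key=scores.get), scores


# Hypothetical labelings of six objects by two methods at k = 2 and k = 3.
method_a = {2: [0, 0, 0, 1, 1, 1], 3: [0, 0, 1, 1, 2, 2]}
method_b = {2: [1, 1, 1, 0, 0, 0], 3: [0, 1, 1, 2, 2, 0]}
best_k, scores = mcs_candidate(method_a, method_b)
```

In this toy input the two methods agree perfectly at k = 2 (up to label names) and disagree at k = 3, so k = 2 is the candidate.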

The additive biclustering model for two-way two-mode object by variable data implies overlapping clusterings of both the objects and the variables together with a weight for each bicluster (i.e., a pair of an object cluster and a variable cluster). In data analysis, an additive biclustering model is fitted to given data by minimizing a least squares loss function. To this end, two alternating least squares (ALS) algorithms may be used: (1) PENCLUS, and (2) Baier’s ALS approach. However, both algorithms suffer from inherent limitations that may hamper their performance. As a way out, a new ALS algorithm is presented in this paper, based on theoretical results on the optimal design of ALS algorithms. In a simulation study this algorithm is shown to outperform the existing ALS approaches.

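The information technology revolution has drastically altered the competitive landscape and made public demand more individualized and diverse, while also intensifying the internal contradictions of the traditional function-based government. Networked government is the necessary path for government reform in a networked environment, and its key feature is electronic, intelligent re-engineering of workflows. By shifting from function-based management to process-based management, government agencies can make better use of government's competitive advantages and realize government's value orientation.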

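Concerning the multiple realization thesis, Shapiro constructs a dilemma. In a case of multiple realization, if the two physical realizers are of the same type, then the functional type is in fact singly realized and should be reduced to the physical type; if the two physical realizers are of genuinely different types, then no empirical law can group these two physical types under the same functional type, and the functional type should be eliminated. Whether by reduction or by elimination, the autonomy of the special sciences that study functional types...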

Over the past decade, diagnostic classification models (DCMs) have become an active area of psychometric research. Despite their use, the reliability of examinee estimates in DCM applications has seldom been reported. In this paper, a reliability measure for the categorical latent variables of DCMs is defined. Using theory- and simulation-based results, we show how DCMs uniformly provide greater examinee estimate reliability than IRT models for tests of the same length, a result that is a consequence of the smaller range of latent variable values examinee estimates can take in DCMs. We demonstrate this result by comparing DCM and IRT reliability for a series of models estimated with data from an end-of-grade test, culminating with a discussion of how DCMs can be used to change the character of large scale testing, either by shortening tests that measure examinees unidimensionally or by providing more reliable multidimensional measurement for tests of the same length.

Many similarity coefficients for binary data are defined as fractions. For certain resemblance measures the denominator may become zero. If the denominator is zero, the value of the coefficient is indeterminate. It is shown that the seriousness of the indeterminacy problem differs among the resemblance measures. Following Batagelj and Bren (1995), we remove the indeterminacies by defining appropriate values in critical cases. The author would like to thank three anonymous reviewers for their helpful comments and valuable suggestions on earlier versions of this article.
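The indeterminacy described above can be made concrete with the Jaccard coefficient, a / (a + b + c), whose denominator is zero when both binary vectors contain no presences. The sketch below shows the general idea of defining a value for the critical case; the function name and the default of 1.0 are assumptions made for illustration, not values taken from the article or from Batagelj and Bren (1995).

```python
def jaccard(x, y, critical_value=1.0):
    """Jaccard similarity a / (a + b + c) for two binary vectors.

    critical_value is returned when the denominator is zero (both
    vectors all zero). The default of 1.0 is an assumption for
    illustration only.
    """
    a = sum(1 for xi, yi in zip(x, y) if xi and yi)               # joint presences
    mismatches = sum(1 for xi, yi in zip(x, y) if bool(xi) != bool(yi))  # b + c
    denominator = a + mismatches
    if denominator == 0:
        return critical_value
    return a / denominator


ordinary = jaccard([1, 1, 0], [1, 0, 0])   # one joint presence, one mismatch
critical = jaccard([0, 0, 0], [0, 0, 0])   # denominator is zero
```

Making the critical value an explicit parameter keeps the choice visible to the caller instead of hiding it behind a division-by-zero error or a silent NaN.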

Circular classifications are classification scales with categories that exhibit a certain periodicity. Since linear scales have endpoints, the standard weighted kappas used for linear scales are not appropriate for analyzing agreement between two circular classifications. A family of kappa coefficients for circular classifications is defined. The kappas differ only in one parameter. It is studied how the circular kappas are related and if the values of the circular kappas depend on the number of categories. It turns out that the values of the circular kappas can be strictly ordered in precisely two ways. The orderings suggest that the circular kappas are measuring the same thing, but to a different extent. If one accepts the use of magnitude guidelines, it is recommended to use stricter criteria for circular kappas that tend to produce higher values.

This study evaluates the performance of information criteria used to separate latent classes. In the evaluations, datasets were simulated under various numbers of latent classes, sample sizes, parameter structures, and latent-class complexities. The core results of the experiment were the average accuracy rates of the information criteria in selecting the designed numbers of latent classes. The study revealed that widely used information criteria, e.g., AIC, BIC, and CAIC, could perform poorly under some circumstances. By including a sample size adjustment (Rissanen, 1978), the unsatisfactory performance could be improved considerably. The sample size adjustment provides a plausible solution for separating latent classes. Guidelines are provided to help achieve optimum use of the model fit indices.
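For reference, the criteria named in this abstract can be computed from a fitted model's log-likelihood as sketched below. AIC, BIC, and CAIC follow their standard definitions. The abstract does not give the formula of its sample size adjustment, so the (n + 2) / 24 form used here, which is the commonly used sample-size-adjusted BIC, is an assumption; the function names and the example log-likelihoods are hypothetical.

```python
from math import log


def information_criteria(log_lik, n_params, n_obs):
    """Standard information criteria; smaller values indicate a preferred model."""
    deviance = -2.0 * log_lik
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * log(n_obs),
        "CAIC": deviance + n_params * (log(n_obs) + 1),
        # Assumed form of the sample size adjustment: n replaced by (n + 2) / 24.
        "SABIC": deviance + n_params * log((n_obs + 2) / 24),
    }


def select_classes(fits, n_obs, criterion="SABIC"):
    """Pick the number of classes whose fit minimizes the chosen criterion.

    fits maps a number of latent classes to (log_likelihood, n_params).
    """
    values = {
        k: information_criteria(ll, p, n_obs)[criterion]
        for k, (ll, p) in fits.items()
    }
    return min(values, key=values.get)


# Hypothetical fits for 1 to 3 latent classes on 200 observations.
fits = {1: (-520.0, 4), 2: (-480.0, 9), 3: (-476.0, 14)}
example = information_criteria(-100.0, 5, 200)
chosen = select_classes(fits, n_obs=200)
```

Because the adjusted criterion penalizes each parameter less heavily than BIC at the same sample size, it is less prone to choosing too few classes when samples are small.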

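"Sintering" is a core term in both mineral engineering and powder metallurgy. Although both kinds of sintering change the state of aggregation of materials through high temperature, the two have been given different meanings and differ in a series of essential ways, so neither can replace the other. When compiling the basic terms of metallurgy, the two senses of "sintering" should be listed separately, and neither may be omitted.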

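Starting from the first function of morality, "safeguarding human interests", this paper argues that respecting the rights of nature and of its species and other components amounts to maintaining ecological balance. The intrinsic value of species, ecosystems, and other components of the Earth's biosphere is their value for the stability of the system. Interests are the needs and habits of species. Starting from the second function of morality, "satisfying human spiritual needs and realizing one's own value", the paper argues that the rights, interests, and values of nature and of its ecosystems, species, and individual animals and plants are the grounds for loving life, satisfying spiritual needs, and realizing one's own value. Addressing the difficulty that non-anthropocentric environmental ethics equates "is" with "ought", the paper argues that the grounds for rights, interests, and values proposed by these schools consist in seeking features of individual animals and plants, species, ecosystems, and nature that resemble those of humans, so that they can be cared for as if they were human.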

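Technology domestication is a new concept introduced into technology studies in the 1990s from cultural and media studies. It captures users' agency in consuming and using technology and offers a key perspective for understanding the relationship between users and products. Technology domestication is the process of integrating a technological product into its environment of use so that it becomes part of the network of practices and culture in which users are situated. It is a process of learning and of conferring meaning; it is through domestication that the social and cultural value of technological products takes shape. The collective process of domesticating a technology may leave the technology deeply entrenched and hard to dislodge, or it may "destabilize" an established technology. The latter offers a possible way to break technological lock-in and escape path dependence.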

k-Adic formulations (for groups of objects of size k) of a variety of 2-adic similarity coefficients (for pairs of objects) for binary (presence/absence) data are presented. The formulations are not functions of 2-adic similarity coefficients. Instead, the main objective of the paper is to present k-adic formulations that reflect certain basic characteristics of, and have a similar interpretation as, their 2-adic versions. Two major classes are distinguished. The first class is referred to as Bennani-Heiser similarity coefficients, which contains all coefficients that can be defined using just the matches, the number of attributes that are present and that are absent in k objects, and the total number of attributes. The coefficients in the second class can be formulated as functions of Dice’s association indices. The author thanks Willem Heiser and three anonymous reviewers for their helpful comments and valuable suggestions on earlier versions of this article.

We introduce new similarity measures between two subjects, with reference to variables with multiple categories. In contrast to traditionally used similarity indices, they also take into account the frequency of the categories of each attribute in the sample. This feature is useful when dealing with rare categories, since it makes sense to evaluate the pairwise presence of a rare category differently from the pairwise presence of a widespread one. A weighting criterion for each category derived from Shannon's information theory is suggested. There are two versions of the weighted index: one for independent categorical variables and one for dependent variables. The suitability of the proposed indices is shown in this paper using both simulated and real world data sets.

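A Study of the Important Role of Humanistic Thought in the Development of Natural Science
The development of the natural sciences and of the humanities is in essence mutually reinforcing. Philosophy, aesthetics, religion, ethics, literature, and art within the humanities all promote natural science in different ways and to different degrees. This shows in at least four respects: many basic principles of the humanities have become theoretical foundations of natural science; the ideas and artistic conceptions of the humanities are a source of inspiration and motivation for scientific discovery; many ways of thinking in the humanities have become research methods in natural science; and many concepts and modes of expression from the humanities are used in natural science. We should consciously promote a broad integration of the natural sciences and the humanities so as to create a still more splendid culture.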

gene trees and species trees. We construct different lineage histories for different genes, in spite of the fact that intragenic recombination ensures that building a gene tree can become an exercise in averaging over disparate (and reticulating) segmental phylogenies. Combining data across disparate gene trees leads to an average species tree, but whether that represents anything real is dubious. Another ploy is to study mitochondrial and/or chloroplast genomes, confidently asserted to be inherited in strictly lineal fashion, without recombination. Evidence is mounting, however, that even these organellar elements have recombination and that their phylogenies are reticulate. Given the generally reticulate process of evolution at the subspecific level, we should model the collection of relationships more as a redundant and multiply connected network than as a strictly radiating phylogeny.

In this paper, the potentialities of transvariation (Gini, 1959) in measuring the separation between two groups of multivariate observations are explored. With this aim, a modified version of Gini’s notion of multidimensional transvariation is proposed. According to Gini (1959), two groups G_1 and G_2 are said to transvary on the k-dimensional variable X = (X_1, ..., X_h, ..., X_k) if there exists at least one pair of units, belonging to different groups, such that for h = 1, ..., k the sign of the difference between their X_h values is opposite to that of m_1h − m_2h, where m_1h and m_2h are the corresponding group mean values of X_h. We introduce a modification that allows us to derive a measure of group separation, which can be profitably used in discriminating between two groups. The performance of the measure is tested through simulation experiments. The results show that the proposed measure is not sensitive to distributional assumptions and highlight its robustness against outliers.
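Gini's definition as stated in this abstract can be checked directly on data. The sketch below tests whether a pair of units from different groups transvaries on every dimension and reports the share of such pairs. It illustrates only the definition quoted above, not the modified measure the authors propose; the function names, the treatment of zero differences as non-transvarying, and the example data are assumptions.

```python
def mean_vector(group):
    """Component-wise mean of a list of equal-length observation tuples."""
    k = len(group[0])
    return [sum(row[h] for row in group) / len(group) for h in range(k)]


def transvaries(u, v, m1, m2):
    """True if, on every dimension h, sign(u_h - v_h) is opposite to sign(m1_h - m2_h).

    Zero differences are treated as not transvarying (an assumption).
    """
    return all((u[h] - v[h]) * (m1[h] - m2[h]) < 0 for h in range(len(u)))


def transvariation_share(g1, g2):
    """Fraction of cross-group pairs that transvary on all dimensions."""
    m1, m2 = mean_vector(g1), mean_vector(g2)
    count = sum(transvaries(u, v, m1, m2) for u in g1 for v in g2)
    return count / (len(g1) * len(g2))


# Hypothetical two-dimensional groups; (5, 5) in g1 lies beyond (4, 4) in g2.
g1 = [(0, 0), (1, 1), (5, 5)]
g2 = [(4, 4), (6, 6), (7, 7)]
overlapping = transvariation_share(g1, g2)
separated = transvariation_share([(0, 0), (1, 1)], [(5, 5), (6, 6)])
```

A share of zero means no cross-group pair contradicts the ordering of the group means, which is the situation of complete separation under this definition.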

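From Yixing's Dayan calendar (Dayan li, 724 AD) onward, the algorithm for gnomon shadows in the nine domains (jiufu guiying) became a component of calendrical computation in the Tang, Song, Jin, and Yuan periods. Through a systematic reading of the nine-domain gnomon shadow algorithms of this period, the paper explains how the algorithm was constructed and what its characteristics are. Yixing's Dayan calendar uses a table-based algorithm, whereas Bian Gang's Chongxuan calendar (Chongxuan li, 892 AD) and all later calendars use a formula-based algorithm. In the Qintian calendar (Qintian li, 956 AD), Wang Pu used the results of the meridian survey carried out under Yixing in the Kaiyuan era to construct a distinctive nine-domain gnomon shadow function, from which the first expression for the tangent function in the history of Chinese mathematical astronomy can be derived.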

Explaining the complex dynamics exhibited in many biological mechanisms requires extending the recent philosophical treatment of mechanisms that emphasizes sequences of operations. To understand how nonsequentially organized mechanisms will behave, scientists often advance what we call dynamic mechanistic explanations. These begin with a decomposition of the mechanism into component parts and operations, using a variety of laboratory-based strategies. Crucially, the mechanism is then recomposed by means of computational models in which variables or terms in differential equations correspond to properties of its parts and operations. We provide two illustrations drawn from research on circadian rhythms. Once biologists identified some of the components of the molecular mechanism thought to be responsible for circadian rhythms, computational models were used to determine whether the proposed mechanisms could generate sustained oscillations. Modeling has become even more important as researchers have recognized that the oscillations generated in individual neurons are synchronized within networks; we describe models being employed to assess how different possible network architectures could produce the observed synchronized activity.
