Found 20 similar documents; search took 15 ms
1.
This paper proposes a maximum clustering similarity (MCS) method for determining the number of clusters in a data set by studying
the behavior of similarity indices comparing two (of several) clustering methods. The similarity between the two clusterings
is calculated at the same number of clusters, using the indices of Rand (R), Fowlkes and Mallows (FM), and Kulczynski (K),
each corrected for chance agreement. The number of clusters at which the index attains its maximum is a candidate for the
optimal number of clusters. The proposed method is applied to simulated bivariate normal data, and further extended for use
in circular data. Its performance is compared to the criteria discussed in Tibshirani, Walther, and Hastie (2001). The proposed
method is not based on any distributional or data assumptions, which makes it widely applicable to any type of data that can
be clustered using at least two clustering algorithms.
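The selection step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it covers only the chance-corrected Rand index (not FM or K), and it assumes the label vectors for each number of clusters have already been produced by two clustering methods. The function names `adjusted_rand_index` and `mcs_number_of_clusters` and the toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Rand index corrected for chance agreement between two labelings."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same objects")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two objects: trivially identical
    # Pairs of objects placed together by both, by A only, and by B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are one big cluster
    return (together_both - expected) / (maximum - expected)


def mcs_number_of_clusters(clusterings_a, clusterings_b):
    """Return (best_k, scores) where each input maps k -> label list.

    best_k is the k at which the two methods agree most; ties go to
    the smallest k because candidates are scanned in increasing order.
    """
    shared_ks = sorted(set(clusterings_a) & set(clusterings_b))
    scores = {k: adjusted_rand_index(clusterings_a[k], clusterings_b[k])
              for k in shared_ks}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Toy labels for six objects, as two hypothetical methods might produce them.
method_a = {2: [0, 0, 0, 1, 1, 1], 3: [0, 0, 1, 1, 2, 2]}
method_b = {2: [1, 1, 1, 0, 0, 0], 3: [0, 1, 1, 2, 2, 0]}
best_k, scores = mcs_number_of_clusters(method_a, method_b)
```

In the toy data the two methods agree perfectly at two clusters (up to relabeling) and disagree at three, so two clusters is the candidate.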
2.
Marek Ancukiewicz 《Journal of Classification》1998,15(1):129-141
I consider a new problem of classification into n (n ≥ 2) disjoint classes based on features of unclassified data. It is assumed that the data are grouped into m (m ≥ n) disjoint sets and that within each set the distribution of features is a mixture of the distributions corresponding to particular
classes. Moreover, the mixing proportions should be known and form a matrix of rank n. The idea of the solution is first to estimate the feature densities in all the groups and then to solve the linear system for the component
densities. The proposed classification method is asymptotically optimal, provided a consistent method of density estimation
is used. For illustration, the method is applied to determining perfusion status in myocardial infarction patients, using
creatine kinase measurements.
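The "solve the linear system" step in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the group densities are taken as already estimated on a shared grid, and when there are more groups than classes the system is solved in the least-squares sense through the normal equations, which is a choice made here and not something the abstract specifies. The names `solve_linear_system` and `unmix_densities` and all numbers are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("mixing matrix does not have full column rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def unmix_densities(mixing, group_densities):
    """Recover class densities from group densities on a shared grid.

    mixing[g][i] is the known proportion of class i in group g (m x n).
    group_densities[g][t] is the estimated density of group g at grid point t.
    Returns class_densities[i][t], the least-squares solution at each point.
    """
    m, n = len(mixing), len(mixing[0])
    grid_size = len(group_densities[0])
    # Normal equations: (P^T P) x = P^T f, solved separately at each grid point.
    gram = [[sum(mixing[g][i] * mixing[g][j] for g in range(m)) for j in range(n)]
            for i in range(n)]
    class_densities = [[0.0] * grid_size for _ in range(n)]
    for t in range(grid_size):
        rhs = [sum(mixing[g][i] * group_densities[g][t] for g in range(m))
               for i in range(n)]
        for i, value in enumerate(solve_linear_system(gram, rhs)):
            class_densities[i][t] = value
    return class_densities


# Made-up example: two classes, three groups, densities on a three-point grid.
true_classes = [[0.5, 0.3, 0.2], [0.1, 0.2, 0.7]]
mixing = [[0.8, 0.2], [0.3, 0.7], [0.5, 0.5]]
groups = [[sum(p * true_classes[i][t] for i, p in enumerate(row)) for t in range(3)]
          for row in mixing]
recovered = unmix_densities(mixing, groups)
```

With noise-free group densities the recovered values match the class densities used to build the example; with estimated densities the result is only as good as the density estimator.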
3.
The issue of determining "the right number of clusters" in K-Means has attracted considerable interest, especially in
recent years. Cluster intermix appears to be one of the factors that most affect the clustering results. This paper proposes an experimental
setting for comparing different approaches on data generated from Gaussian clusters with controlled parameters of
between- and within-cluster spread to model cluster intermix. The setting allows centroid recovery to be evaluated on a par
with the conventional evaluation of cluster recovery. The subjects of our interest are two versions of the "intelligent" K-Means method, iK-Means, which find the "right" number of clusters by extracting "anomalous patterns" from the data one by one. We compare them
with seven other methods, including Hartigan's rule, averaged Silhouette width, and the Gap statistic, under different between-
and within-cluster spread-shape conditions. There are several consistent patterns in the results of our experiments, such
as that the right K is reproduced best by Hartigan's rule, although the clusters and their centroids are not. This leads us to propose an adjusted version
of iK-Means, which performs well in the current experimental setting.
4.
Christian Hennig 《Journal of Classification》2002,19(2):249-276
In this paper an algorithm is developed which aims to find all fixed point clusters (FPCs) of a dataset corresponding
to well-separated linear regression subpopulations. Its ability to find such
subpopulations in the presence of outliers is compared, by means of a simulation study, to that of methods based on
ML estimation of mixture models. Furthermore,
FPC analysis is applied to a real dataset.
5.
6.
This paper presents the development of a new methodology which simultaneously estimates in a least-squares fashion both an ultrametric tree and respective variable weightings for profile data that have been converted into (weighted) Euclidean distances. We first review the relevant classification literature on this topic. The new methodology is presented including the alternating least-squares algorithm used to estimate the parameters. The method is applied to a synthetic data set with known structure as a test of its operation. An application of this new methodology to ethnic group rating data is also discussed. Finally, extensions of the procedure to model additive, multiple, and three-way trees are mentioned. The first author is supported as Bevoegdverklaard Navorser of the Belgian Nationaal Fonds voor Wetenschappelijk Onderzoek.
7.
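The importance of tacit knowledge for technological innovation has attracted wide attention from academia and industry in China and abroad. By reviewing and summarizing foreign scholars' research on the role and sources of tacit knowledge, this paper argues that the production, transfer, and use of tacit knowledge are indispensable if firms are to improve their capacity for technological innovation.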
8.
9.
10.
《自然辩证法研究》2019,(9):49-54
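The discovery of pulsars was a major event in the history of astronomy and opened a new era in radio astronomy. This paper examines the discovery process from the perspective of the relationship between science and technology. It argues that two factors were at work: the scientific tradition, namely the tradition of astrophysical ideas and theories since the 1930s, and new observational technology, namely the development of radio-astronomical observing methods and instruments. These two factors were like the two "eyes" that discovered pulsars. As for the scientific tradition, the discovery stands in a direct line with a series of astrophysical developments, including the measurement of interplanetary scintillation, the discovery of quasars, and the neutron star hypothesis in the theory of stellar evolution. As for observational technology, the discovery was inseparable from the observing methods and technical innovations of the interplanetary scintillation array radio telescope; new technology brought new discoveries to astronomy. The discovery of pulsars shows how modern astronomy has advanced through the interweaving of science and technology.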
11.
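The method of excess and deficit (盈不足, ying bu zu) is an important and general method in ancient mathematics. Drawing on the 《算数书》 (Suan shu shu), the 《九章算术》 (Jiuzhang suanshu, Nine Chapters on the Mathematical Art), and other sources, this paper explores its formation and transmission in early China. It argues that in pre-Qin times practical work often yielded too much or too little of something; to obtain the right quantity, people applied their knowledge of ratio, proportion, and fractions and arrived at the excess-and-deficit method. The method and its applications formed a branch of pre-Qin mathematics, "excess and deficit", which was recorded in the pre-Qin ancestral version of the 《九章算术》. Under its direct or indirect influence, scholars from the pre-Qin period to the Han dynasty posed many excess-and-deficit problems as needed, and the excess-and-deficit problems in the 《算数书》 derive from these.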
12.
Andrew R. Webb 《Journal of Classification》1997,14(2):249-267
This paper considers the use of radial basis functions for exploratory data analysis. These are used to model a transformation
from a high-dimensional observation space to a low-dimensional one. The parameters of the model are determined by optimising
a loss function defined to be the stress function in multidimensional scaling. The metric for the low-dimensional space is
taken to be the Minkowski metric with order parameter 1 ≤ p ≤ 2. A scheme based on iterative majorisation is proposed.
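The loss function named in this abstract can be illustrated with a short sketch. It assumes the raw (unnormalised, unweighted) form of stress, which is one common variant and may differ from the one used in the paper, and it evaluates the loss for a given low-dimensional configuration without modelling the radial basis function transformation or the majorisation scheme. The names `minkowski_distance` and `raw_stress` are hypothetical.

```python
from itertools import combinations


def minkowski_distance(u, v, p):
    """Minkowski distance of order p between two points of equal dimension."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


def raw_stress(dissimilarities, configuration, p=2.0):
    """Sum of squared differences between target dissimilarities and
    Minkowski distances in the low-dimensional configuration.

    dissimilarities[i][j] is the target dissimilarity between objects i and j;
    configuration[i] is the low-dimensional position of object i.
    """
    n = len(configuration)
    return sum(
        (dissimilarities[i][j]
         - minkowski_distance(configuration[i], configuration[j], p)) ** 2
        for i, j in combinations(range(n), 2)
    )


# Two objects whose target dissimilarity is 5, placed at (0, 0) and (3, 4).
targets = [[0.0, 5.0], [5.0, 0.0]]
points = [(0.0, 0.0), (3.0, 4.0)]
stress_euclidean = raw_stress(targets, points, p=2.0)  # distance 5, perfect fit
stress_city_block = raw_stress(targets, points, p=1.0)  # distance 7, misfit
```

The same configuration fits the targets exactly under p = 2 but not under p = 1, which shows how the choice of order parameter changes the loss being minimised.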
13.
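Small nucleolar RNAs (snoRNAs) are a large class of non-coding RNAs found in the nuclei of eukaryotic cells. Apart from MRP RNA, the snoRNAs known so far fall into two major classes according to their structure and function: box C/D and box H/ACA. Apart from a few snoRNAs that play an important role in the cleavage processing and post-transcriptional modification of rRNA precursors, most snoRNAs act through antisense complementarity. Box C/D snoRNAs guide 2'-O-methylation at specific sites of rRNA or snRNA, whereas box H/ACA snoRNAs guide pseudouridylation of rRNA or snRNA. Depending on whether they are antisense-complementary to rRNAs or snRNAs, snoRNAs can also be divided into guide snoRNAs and orphan snoRNAs (orph…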
14.
Eric B. Dent 《Foundations of Science》2003,8(3):295-314
It is time that we in organization sciences develop and implement a new mental model for cause and effect relationships. The dominant model in research dates at least to the 1700s and no longer serves the full purposes of the social science research problems of the 21st century. Traditionally, research is "essentially concerned with two-variable problems, linear causal trains, one cause and one effect, or with few variables at the most" (von Bertalanffy, 1968, p. 12). However, the literature is replete with examples of phenomena in which the traditional cause and effect construct does not allow for greater understanding and insight into the phenomena. Different conceptions of cause and effect relationships have been developed including producer/product relationships (Ackoff 1981), design causality (Argyris and Schon, 1996), and four classes of causal models (Schwartz and Ogilvy, 1979). Of interest here is the possibility of mutual causality, "the assumption that the relationship between two (or more) phenomena is heavily influenced by the presence of feedback loops that are instantaneous, or nearly so" (Dent, 1999). Maturana's (1998, Maturana and Varela, 1987) work on a new epistemology and ontology provides a foundation for the alternative model of cause and effect proposed here. This interaction model includes the dynamics of the traditional X and Y, but adds the structure of X (A), the structure of Y (B), the environment (E), and time (T).
15.
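Wheeler was a physicist concerned with the foundations of physics and the nature of reality. He inherited the tradition of the Copenhagen school and pushed its ideas to their limit. Over his lifetime he put forward many physical propositions with philosophical implications, such as the delayed-choice experiment and the participatory universe. This paper focuses on some of his physical propositions at the intersection of quantum mechanics and general relativity, such as black holes, geons, and quantum foam.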
16.
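This article takes issue with Xin Deyong's paper 《释"白田"》 ("Interpreting 'Baitian'"). It argues that baitian (白田) refers only to dry fields without artificial irrigation and is not a synonym for dry fields in general; in addition, fields left empty with no crop planted could also be called baitian. Likewise, before the Tang and Song dynasties, shuitian (水田) did not refer only to rice paddies but to irrigated fields in general, including dry land that could receive artificial irrigation. The distinction between baitian and shuitian resulted from the development of irrigation in the north and had nothing to do with the opening up of the south.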
17.
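The 《吕氏春秋》 (Lüshi Chunqiu) records the length of the Huangzhong pitch pipe as three cun nine fen, whereas from the Han dynasty onward a length of nine cun was generally adopted. To establish the true length of the pre-Qin Huangzhong pipe, the sources were examined systematically. The examination finds that the three-cun-nine-fen pipe recorded in the 《吕氏春秋》 should be the high (qing) Huangzhong pipe, sounding one octave above the nine-cun pipe. The textual study also shows that pre-Qin musicians tuned by ear, that is, both the three-cun-nine-fen pipe and the nine-cun pipe were products of tuning by ear. On this basis, the diameter of the Huangzhong pipe was calculated from acoustic formulas combined with the textual evidence. The conclusion is that pre-Qin pitch pipes were probably bamboo tubes with a diameter of about 0.71 cun. The two claims that the Huangzhong pipe measured eight and one-tenth cun or eight and one-seventh cun were also examined, and neither is considered reliable.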
18.
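The role of regional innovation networks in the development of high-tech industry: an interpretation of innovation in Silicon Valley  Total citations: 12, self-citations: 0, citations by others: 12
That technological innovation is realized in large part through the founding of innovative small firms is an important feature of innovation during the formation and growth stages of high-tech industries, and one of the main forms of Silicon Valley-style innovation. This paper analyzes the essence of technological innovation, viewing it as a process of "creative construction"; the shifting role of the innovating agent while an innovative firm is being founded; and the strengths and weaknesses of small firms themselves. On this basis it explains the important role of regional innovation networks in the development of high-tech industries, and it also offers a new interpretation of technological innovation in Silicon Valley.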
19.
Unfolding creates configurations from preference information. In this paper, it is argued that not all preference information needs to be collected and that good solutions are still obtained, even when more than half of the data is missing. Simulation studies are conducted to compare missing data treatments, sources of missing data, and designs for the specification of missing data. Guidelines are provided and used in actual practice.
20.
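This paper takes a lead-acid battery manufacturer in Hangzhou, Zhejiang Province, as a case study. Focusing on the actors from which institutional pressure originates, it identifies the key stakeholders that influence corporate eco-innovation and the mechanisms through which they do so. The main conclusions are as follows. First, the environmental orientation of government, the environmental orientation of customers, the environmental orientation of competitors, and the environmental awareness of senior managers are the four main factors influencing corporate eco-innovation, and they affect different dimensions of it. Second, the environmental orientations of government and customers influence eco-innovation mainly through regulatory and normative punishment and through resource inducement; the environmental orientation of competitors does so mainly through contests for regulatory and normative legitimacy and contests for resources; and the environmental awareness of senior managers does so through two drivers, morality and interest.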