20 similar documents retrieved.
1.
The problem of measuring the impact of individual data points in a cluster analysis is examined. The purpose is to identify those data points that have an influence on the resulting cluster partitions. Influence of a single data point is considered present when different cluster partitions result from the removal of the element from the data set. The Hubert and Arabie (1985) corrected Rand index was used to provide numerical measures of influence of a data point. Simulated data sets consisting of a variety of cluster structures and error conditions were generated to validate the influence measures. The results showed that the measure of internal influence was 100% accurate in identifying those data elements exhibiting an influential effect. The nature of the influence, whether beneficial or detrimental to the clustering, can be evaluated with the use of the gamma and point-biserial statistics.
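As a rough illustration of the removal-based influence idea described in this abstract, the sketch below reclusters the data with one point deleted and scores agreement with the original partition on the remaining points using the Hubert-Arabie adjusted Rand index (here via scikit-learn's k-means and adjusted_rand_score). The function name and settings are illustrative assumptions, not the paper's internal/external influence measures.

```python
# Sketch: influence of each point measured by reclustering after its removal
# and comparing partitions on the remaining points (adjusted Rand index).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

def influence_by_removal(X, k=3, seed=0):
    """Low ARI for point i means removing i changes the partition of the
    other points, i.e. point i is influential."""
    full = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    scores = np.empty(len(X))
    for i in range(len(X)):
        keep = np.delete(np.arange(len(X)), i)
        reduced = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X[keep])
        scores[i] = adjusted_rand_score(full[keep], reduced)
    return scores

# Two clusters plus one stray point; prints one score per point, and lower
# scores flag the more influential observations.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(8, 1, (30, 2)), [[4.0, 20.0]]])
print(influence_by_removal(X, k=2).round(2))
```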
2.
Weighting and selection of variables for cluster analysis
One of the thorniest aspects of cluster analysis continues to be the weighting and selection of variables. This paper reports on the performance of nine methods on eight leading-case simulated and real data sets. The results demonstrate shortcomings of weighting based on the standard deviation or range, as well as of other more complex schemes in the literature. Weighting schemes based upon carefully chosen estimates of within-cluster and between-cluster variability are generally more effective. These estimates do not require knowledge of the cluster structure. Additional research is essential: worry-free approaches do not yet exist.
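The abstract does not spell out the weighting formulas, so the following is only a loosely related sketch of the general idea: up-weight variables whose overall spread is large relative to a crude estimate of within-cluster spread obtained without cluster labels (here, a low quantile of pairwise absolute differences). The scheme and its parameters are assumptions for illustration, not the nine methods studied in the paper.

```python
# Illustrative-only weighting scheme: variables with large overall spread
# relative to an estimated within-cluster spread get larger weights;
# no cluster labels are needed.
import numpy as np

def cluster_weights(X, within_quantile=0.2):
    n, p = X.shape
    weights = np.empty(p)
    for j in range(p):
        col = X[:, j]
        # small pairwise gaps serve as a crude proxy for within-cluster spread
        diffs = np.abs(col[:, None] - col[None, :])[np.triu_indices(n, k=1)]
        within = np.quantile(diffs, within_quantile) + 1e-12
        weights[j] = col.std() / within
    return weights / weights.sum()

# A variable with two well-separated groups receives a larger weight than a
# pure-noise variable of comparable scale.
rng = np.random.default_rng(0)
X = np.column_stack([np.r_[rng.normal(0, 1, 40), rng.normal(6, 1, 40)],
                     rng.normal(0, 3, 80)])
print(cluster_weights(X).round(2))
```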
3.
Jacqueline J. Meulman, Journal of Classification, 1996, 13(2): 249-266
An approach is presented for analyzing a heterogeneous set of categorical variables assumed to form a limited number of homogeneous subsets. The variables generate a particular set of proximities between the objects in the data matrix, and the objective of the analysis is to represent the objects in low-dimensional Euclidean spaces, where the distances approximate these proximities. A least squares loss function is minimized that involves three major components: a) the partitioning of the heterogeneous variables into homogeneous subsets; b) the optimal quantification of the categories of the variables; and c) the representation of the objects through multiple multidimensional scaling tasks performed simultaneously. An important aspect from an algorithmic point of view is the use of majorization. The use of the procedure is demonstrated by a typical example of possible application, i.e., the analysis of categorical data obtained in a free-sort task. The results of points of view analysis are contrasted with a standard homogeneity analysis, and the stability is studied through a Jackknife analysis.
4.
This paper develops a new procedure for simultaneously performing multidimensional scaling and cluster analysis on two-way compositional data of proportions. The objective of the proposed procedure is to delineate patterns of variability in compositions across subjects by simultaneously clustering subjects into latent classes or groups and estimating a joint space of stimulus coordinates and class-specific vectors in a multidimensional space. We use a conditional mixture, maximum likelihood framework with an E-M algorithm for parameter estimation. The proposed procedure is illustrated using a compositional data set reflecting proportions of viewing time across television networks for an area sample of households.
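The conditional mixture, maximum likelihood model in this abstract jointly estimates stimulus coordinates and class-specific vectors; as a loose stand-in for the latent-class half of the idea only, the sketch below fits a generic finite mixture by E-M (scikit-learn's GaussianMixture) to log-ratio-transformed compositions. All names and settings are assumptions, not the authors' specification.

```python
# Stand-in for the latent-class part only: a generic finite mixture fitted by
# E-M on log-ratio-transformed compositions.
import numpy as np
from sklearn.mixture import GaussianMixture

def latent_classes_from_compositions(P, n_classes=3, seed=0):
    """P: rows are subjects, columns are positive proportions summing to 1."""
    clr = np.log(P) - np.log(P).mean(axis=1, keepdims=True)  # centered log-ratio
    gm = GaussianMixture(n_components=n_classes, covariance_type="diag",
                         random_state=seed).fit(clr)
    return gm.predict(clr), gm.means_

# Toy usage: Dirichlet draws stand in for households' viewing-time shares
# across television networks.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(5), size=40)
classes, class_profiles = latent_classes_from_compositions(P)
print(np.bincount(classes))
```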
5.
The standard procedure in numerical classification and identification of micro-organisms based on binary features is given a justification based on the principle of maximum entropy. This principle also strongly supports the assumption that all characteristics upon which the classification is based are equally important and the use of polythetic taxa. The relevance of the principle of maximum entropy in connection with taxonomic structures based on clustering and maximal predictivity is discussed. A result on asymptotic separateness of maximum entropy distributions has implications for minimizing identification errors.
The work was partially supported by the Bank of Sweden Tercentenary Foundation, The Swedish Council for Forestry and Agricultural Research, The Carl Trygger Foundation, and the Swedish Cancer Foundation.
6.
A general set of multidimensional unfolding models and algorithms is presented to analyze preference or dominance data. This class of models, termed GENFOLD2 (GENeral UnFOLDing Analysis-Version 2), allows one to perform internal or external analysis, constrained or unconstrained analysis, conditional or unconditional analysis, and metric or nonmetric analysis, while providing the flexibility of specifying and/or testing a variety of different types of unfolding-type preference models mentioned in the literature, including Carroll's (1972, 1980) simple, weighted, and general unfolding analysis. An alternating weighted least-squares algorithm is utilized and discussed in terms of preventing degenerate solutions in the estimation of the specified parameters. Finally, two applications of this new method are discussed concerning preference data for ten brands of pain relievers and twelve models of residential communication devices.
7.
8.
In the West, experts have long debated which term should be used for the concept of "autonomous learning." This article traces, from a diachronic perspective, the different English terms for "autonomous learning" and their meanings, analyzes the relationships among these terms, compares the different connotations of "autonomous learning" in Chinese and Western cultures, and discusses the significance of a correct understanding of the concept.
9.
10.
The acupuncture bronze figure is a practical model and cultural vehicle associated with Chinese acupuncture; it reached Japan no later than the fourteenth century, and in the Edo period Japanese-made acupuncture figures flourished. This paper briefly reviews the characteristics of the acupuncture profession in Edo-period Japan and shows how teaching within schools and professional advertising generated market demand for acupuncture figures; on the objective side, it explains the practical reasons behind the particular favor enjoyed by paper figures in the Edo period; on the subjective side, drawing on Edo papermaking...
11.
An analysis of the technological content of dynamic capabilities
This paper first reviews the main views of the resource-based school and the core-competence school and points out their problems and limitations. It then introduces the main ideas of dynamic capability theory, arguing that in a rapidly changing technological environment the cultivation and growth of capabilities is a dynamic process: the foundation of dynamic capability theory is the continual cultivation, renewal, and technological learning of capabilities as technological paradigms and trajectories shift. The paper analyzes the technological content of dynamic capabilities and, on the question of how to build them, proposes strengthening firms' technological learning.
12.
Cybercrime and its comprehensive governance from a sociological perspective
From the perspective of the sociology of the Internet, cybercrime includes such types as hacking, parasitic abuse, pornography, and subversive content. Judged by its social consequences, cybercrime not only disrupts the normal order of social life for those active online but also has a considerable impact on the life of the online society as a whole. Resolving or eliminating cybercrime requires the whole online society to adopt a variety of means and carry out comprehensive governance.
13.
14.
The cult of information is a form of worship derived from the worship of science, technology, money, and hackers. To break blind information worship and adapt to life in the information age, we must correctly understand the relationship between information and knowledge, position the function of information appropriately, cultivate people's information literacy, and uphold the primacy of the human subject.
15.
We consider two fundamental properties in the analysis of two-way tables of positive data: the principle of distributional equivalence, one of the cornerstones of correspondence analysis of contingency tables, and the principle of subcompositional coherence, which forms the basis of compositional data analysis. For an analysis to be subcompositionally coherent, it suffices to analyze the ratios of the data values. A common approach to dimension reduction in compositional data analysis is to perform principal component analysis on the logarithms of ratios, but this method does not obey the principle of distributional equivalence. We show that by introducing weights for the rows and columns, the method achieves this desirable property and can be applied to a wider class of methods. This weighted log-ratio analysis is theoretically equivalent to "spectral mapping", a multivariate method developed almost 30 years ago for displaying ratio-scale data from biological activity spectra. The close relationship between spectral mapping and correspondence analysis is also explained, as well as their connection with association modeling. The weighted log-ratio methodology is used here to visualize frequency data in linguistics and chemical compositional data in archeology.
The first author acknowledges research support from the Fundación BBVA in Madrid as well as partial support by the Spanish Ministry of Education and Science, grant MEC-SEJ2006-14098. The constructive comments of the referees, who also brought additional relevant literature to our attention, significantly improved our article.
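The following is a compact sketch of weighted log-ratio analysis (spectral mapping) as described in the abstract, under one reading of the method: take logarithms, double-center with row and column weights proportional to the margins, and factor the weighted matrix by SVD. The scalings and coordinate conventions are assumptions and may differ from the paper's.

```python
# Sketch of weighted log-ratio analysis: log-transform, weighted double-
# centering, then a weighted SVD.
import numpy as np

def weighted_log_ratio_analysis(N, n_dims=2):
    P = N / N.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)            # row and column weights
    L = np.log(N)
    Y = L - (L @ c)[:, None] - (r @ L)[None, :] + r @ L @ c   # weighted double-centering
    S = np.sqrt(r)[:, None] * Y * np.sqrt(c)[None, :]
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    F = (U / np.sqrt(r)[:, None]) * sv             # principal row coordinates
    return F[:, :n_dims], sv

# Rows 0 and 1 have identical profiles (one is twice the other), so they get
# the same coordinates (distributional equivalence); and because only ratios
# of the data enter, the analysis is subcompositionally coherent.
N = np.array([[10.0, 20.0, 30.0], [5.0, 10.0, 15.0], [40.0, 10.0, 5.0]])
coords, sv = weighted_log_ratio_analysis(N)
print(coords.round(3))
```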
16.
A methodological problem in applied clustering involves the decision of whether or not to standardize the input variables prior to the computation of a Euclidean distance dissimilarity measure. Existing results have been mixed, with some studies recommending standardization and others suggesting that it may not be desirable. The existence of numerous approaches to standardization complicates the decision process. The present simulation study examined the standardization problem. A variety of data structures were generated which varied the intercluster spacing and the scales for the variables. The data sets were examined in four different types of error environments: error-free data, error-perturbed distances, inclusion of outliers, and the addition of random noise dimensions. Recovery of true cluster structure as found by four clustering methods was measured at the correct partition level and at reduced levels of coverage. Results for eight standardization strategies are presented. It was found that those approaches which standardize by division by the range of the variable gave consistently superior recovery of the underlying cluster structure. The result held over different error conditions, separation distances, clustering methods, and coverage levels. The traditional z-score transformation was found to be less effective in several situations.
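As a toy illustration of the standardization choices compared in this study, the snippet below standardizes the same data by range division and by z-scores before computing Euclidean distances and Ward clustering, then scores recovery of the known groups with the adjusted Rand index. The simulated data and settings are stand-ins and are not meant to reproduce the paper's simulation design or results.

```python
# Toy comparison: range-division vs. z-score standardization before
# Euclidean distances and Ward clustering, scored by the adjusted Rand index.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 50)
X = np.column_stack([
    rng.normal(labels * 3.0, 1.0),            # informative variable
    rng.normal(labels * 30.0, 10.0) * 100.0,  # informative, much larger scale
    rng.normal(0.0, 5.0, 100),                # noise variable
])

def recovery(Z):
    tree = linkage(pdist(Z), method="ward")
    cut = fcluster(tree, t=2, criterion="maxclust")
    return adjusted_rand_score(labels, cut)

by_range = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
by_zscore = (X - X.mean(axis=0)) / X.std(axis=0)
print("range:", round(recovery(by_range), 2), "z-score:", round(recovery(by_zscore), 2))
```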
17.
Dai Zhen's Gougu Geyuan Ji (《勾股割圜记》) viewed from construction and mathematical reasoning
Dai Zhen's contribution to Chinese mathematics has usually been regarded as limited to the editing of mathematical texts, such as collating the Jiuzhang Suanshu (《九章算术》) and reconstituting the Suanjing Shishu (《算经十书》), and his own mathematical achievements are often dismissed as insignificant. This article analyzes Dai Zhen's Gougu Geyuan Ji (《勾股割圜记》) from the standpoint of scholarly construction, arguing that Dai's purpose in writing the book was to create, from the inherent properties of Chinese mathematics, a right-triangle circle-division method that carried on the native tradition and could rival Western trigonometry. From this perspective, many of the features of the book that have been criticized can be given a reasonable explanation. In addition, the literature on Dai Zhen's mathematics rarely discusses his emphasis on mathematical reasoning; another aim of this article is to discuss the importance of such reasoning in the Gougu Geyuan Ji. Dai Zhen claimed that the methods of dividing the circle are exhausted by the mutual determination of the sides of similar right triangles. In the derivations of the gougu procedures in his book, Dai not only gives diagrams but also explicitly lists the corresponding sides of similar right triangles, and whether in the plane or on the sphere, he indeed derives all of the gougu procedures from this property.
18.
Problems in the development of China's online scientific and technical information resources and some countermeasures
With the development of network technology, online scientific and technical information has become people's main source of such information. At the same time, problems have emerged, such as poor information quality, uneven distribution of information, a shortage of original scientific and technical content, inadequate personalized retrieval functions, and serious duplication and pollution of information. This paper therefore offers several suggestions, for reference, on how to better develop China's online scientific and technical information resources.
19.
After the compass appeared, Chinese scholars explored theoretically why it points south. Most of these explorations started from the doctrine of yin-yang and the five phases, combined with contemporary understandings of the shape of the earth. During the Wanli reign, missionaries came to China and brought Western compass theory, the doctrine of the spherical earth, and related scientific and technical knowledge; under the influence of this knowledge, Chinese scholars began to examine the theory of the compass from new perspectives. In these discussions the role of yin-yang and the five phases faded, while analyses from a mechanical point of view increased, something without precedent. Among the missionaries, Ferdinand Verbiest (南怀仁) offered the most systematic theory of the compass, but his theory remained within the bounds of ancient science and was not Gilbert's theory of magnetism. Verbiest's theory was deeply influential in China; as late as the mid-nineteenth century, Chinese scholars were still using it to explain the compass.
20.
With the rapid development of the network economy and the market economy, a sharp contrast has formed between traditional intermediary services for transferring university research results and advanced network information technology, and the conditions now exist for building an information-based intermediary service system for such transfer. Based on an analysis of the current state, characteristics, and development trends of China's Internet industry, this paper proposes a new idea for a web-based system for transferring university research results and carries out a system analysis and design.