Similar articles
 Found 20 similar articles; search time 15 ms
1.
The set of k points that optimally represents a distribution in terms of mean squared error has been called the principal points (Flury 1990). Principal points are a special case of self-consistent points. Any given set of k distinct points in R^p induces a partition of R^p into Voronoi regions, or domains of attraction, according to minimal distance. A set of k points is called self-consistent for a distribution if each point equals the conditional mean of the distribution over its respective Voronoi region. For symmetric multivariate distributions, sets of self-consistent points typically form symmetric patterns. This paper investigates the optimality of different symmetric patterns of self-consistent points for symmetric multivariate distributions and in particular for the bivariate normal distribution. These results are applied to the problem of estimating principal points.
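The self-consistency condition in this abstract can be illustrated on an empirical distribution. The following is only a minimal sketch, assuming a one-dimensional sample and a Lloyd-style iteration (assign each observation to its nearest point, then replace each point by the mean of its cell); the function names `lloyd_1d` and `is_self_consistent` are hypothetical and are not taken from the paper.

```python
def _cells(sample, centers):
    """Split the sample into Voronoi cells (nearest center, ties go to the first)."""
    cells = [[] for _ in centers]
    for x in sample:
        j = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
        cells[j].append(x)
    return cells


def lloyd_1d(sample, centers, iters=100):
    """Repeat 'assign to nearest point, replace each point by its cell mean'."""
    centers = list(centers)
    for _ in range(iters):
        cells = _cells(sample, centers)
        # An empty cell keeps its old center (an assumption of this sketch).
        new = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(cells)]
        if new == centers:
            break
        centers = new
    return centers


def is_self_consistent(sample, centers, tol=1e-9):
    """True if every center equals the mean of its own Voronoi cell."""
    cells = _cells(sample, centers)
    return all(c and abs(sum(c) / len(c) - m) <= tol for c, m in zip(cells, centers))


points = lloyd_1d([-3, -1, 1, 3], [-1, 1.5])
print(points, is_self_consistent([-3, -1, 1, 3], points))
```

A fixed point of this iteration is self-consistent for the sample, but it need not be optimal: as the abstract notes, principal points are only a special case of self-consistent points.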

2.
Clustering of multivariate spatial-time series should consider: 1) the spatial nature of the objects to be clustered; 2) the characteristics of the feature space, namely the space of multivariate time trajectories; 3) the uncertainty associated with the assignment of a spatial unit to a given cluster on the basis of the above complex features. The last aspect is dealt with by using the Fuzzy C-Means objective function, based on appropriate measures of dissimilarity between time trajectories, by distinguishing the cross-sectional and longitudinal aspects of the trajectories. In order to take into account the spatial nature of the statistical units, a spatial penalization term is added to the above function, depending on a suitable spatial proximity/contiguity matrix. A tuning coefficient balances, on one side, discrimination according to the pattern of the time trajectories and, on the other side, an approximate spatial homogeneity of the clusters. A technique for determining an optimal value of this coefficient is proposed, based on an appropriate spatial autocorrelation measure. Finally, the proposed models are applied to the classification of the Italian provinces, on the basis of the observed dynamics of some socio-economic indicators.
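As background for the Fuzzy C-Means objective mentioned in this abstract, the sketch below shows the standard (unpenalized) membership update for a single object, given its distances to the cluster prototypes. It is an illustration only: the spatial penalization term and the trajectory dissimilarities of the paper are not reproduced, and the name `fcm_memberships` and the fuzzifier default `m=2.0` are assumptions.

```python
def fcm_memberships(dists, m=2.0):
    """Standard Fuzzy C-Means memberships of one object.

    dists: distances from the object to each cluster prototype.
    m:     fuzzifier, must be greater than 1 (assumed default 2.0).
    """
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    # An object sitting exactly on one or more prototypes belongs only to them.
    zeros = [i for i, d in enumerate(dists) if d == 0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(dists))]
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1))
    return [1.0 / sum((di / dj) ** power for dj in dists) for di in dists]


u = fcm_memberships([1.0, 3.0])
print(u, sum(u))
```

The memberships sum to one, and the closer prototype receives the larger share, which is how the method expresses uncertainty in the assignment of a unit to a cluster.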

3.
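Ethical problems raised by modern science and technology are becoming increasingly prominent. These problems must be constrained by corresponding institutions, and this constraint is determined mainly by several factors, including the limited nature of freedom, the uncertainty of self-discipline, and the coercive force of institutions. Given how institutions for the ethics of modern science and technology are currently positioned and where they fall short, sound institutions need to be selected and established to constrain the ethical problems that scientific and technological development brings, so that science and technology can better benefit humanity.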

4.
As data sets continue to grow in size and complexity, effective and efficient techniques are needed to target important features in the variable space. Many of the variable selection techniques that are commonly used alongside clustering algorithms are based upon determining the best variable subspace according to model fitting in a stepwise manner. These techniques are often computationally intensive and can require extended periods of time to run; in fact, some are prohibitively computationally expensive for high-dimensional data. In this paper, a novel variable selection technique is introduced for use in clustering and classification analyses that is both intuitive and computationally efficient. We focus largely on applications in mixture model-based learning, but the technique could be adapted for use with various other clustering/classification methods. Our approach is illustrated on both simulated and real data, highlighted by contrasting its performance with that of other comparable variable selection techniques on the real data sets.

5.
6.
7.
Data holders, such as statistical institutions and financial organizations, have a serious and demanding task when producing data for official and public use: controlling the risk of identity disclosure and protecting sensitive information when they communicate data sets among themselves, to governmental agencies and to the public. One of the techniques applied is micro-aggregation. In a Bayesian setting, micro-aggregation can be viewed as the optimal partitioning of the original data set based on the minimization of an appropriate measure of discrepancy, or distance, between two posterior distributions, one of which is conditional on the original data set and the other conditional on the aggregated data set. Assuming d-variate normal data sets and using several measures of discrepancy, it is shown that the asymptotically optimal equal probability m-partition of R^d, with m^(1/d) ∈ N, is the convex one which is provided by hypercubes whose sides are formed by hyperplanes perpendicular to the canonical axes, regardless of which discrepancy measure is used. On the basis of the above result, a method that produces a sub-optimal partition with a very small computational cost is presented. Published online xx, xx, xxxx.
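The axis-aligned, equal-probability partition described in this abstract can be sketched for the simplest case. The code below assumes independent standard normal coordinates and an integer number q = m^(1/d) of cells per axis; these assumptions, and the helper names `axis_cuts` and `cell_index`, are illustrative and do not come from the paper, which treats general d-variate normal data.

```python
from bisect import bisect_right
from statistics import NormalDist


def axis_cuts(q):
    """Cut points that split a standard normal axis into q equal-probability bins."""
    std = NormalDist()  # assumption: mean 0, standard deviation 1
    return [std.inv_cdf(i / q) for i in range(1, q)]


def cell_index(point, cuts):
    """Per-axis bin indices of a point; together they identify one hypercube cell."""
    return tuple(bisect_right(cuts, x) for x in point)


cuts = axis_cuts(4)                  # q = 4 bins per axis
print(cuts)
print(cell_index((-5.0, 0.1), cuts)) # with d = 2 this is one of m = 16 cells
```

Under the stated independence assumption each of the q^d cells carries probability 1/m, and every cell boundary is a hyperplane perpendicular to a canonical axis.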

8.
9.
We discuss the use of orthogonal wavelet transforms in preprocessing multivariate data for subsequent analysis, e.g., by clustering or dimensionality reduction. Wavelet transforms allow us to introduce multiresolution approximation, and multiscale nonparametric regression or smoothing, in a natural and integrated way into the data analysis. As will be explained in the first part of the paper, this approach is of greatest interest for multivariate data analysis when we use (i) datasets with ordered variables, e.g., time series, and (ii) object dimensionalities which are not too small, e.g., 16 and upwards. In the second part of the paper, a different type of wavelet decomposition is used. Applications illustrate the power of this new perspective on data analysis.
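To make the idea of an orthogonal wavelet transform concrete, here is a minimal sketch of one level of the Haar transform applied to a vector with ordered variables. The Haar wavelet is chosen here only because it is the simplest orthogonal example; the abstract does not say which wavelets the paper uses, and the function names are hypothetical.

```python
import math


def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(x) % 2:
        raise ValueError("input length must be even")
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_step, recovering the original ordered values."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


x = [4.0, 2.0, 5.0, 5.0]
approx, detail = haar_step(x)
print(approx, detail, haar_inverse(approx, detail))
```

Because the transform is orthogonal, the sum of squares of the coefficients equals that of the input, so Euclidean distances between objects are unchanged; keeping only the approximation coefficients gives a coarser, smoothed version of each object.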

10.
The paper proposes a method for clustering social and economic objects characterized by many interval variables. This multivariate approach is based on an original conception of interval quantiles constructed using a special definition derived from the notion of the Hausdorff distance. In order to improve the quality of classification, the obtained interval quantile classes can next be aggregated into larger merged classes. The efficiency of our method can be assessed using specially defined indices of entropy and volume coefficients. The second notion replaces the classical concept of area, which is not applicable in this case.
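For readers unfamiliar with the Hausdorff distance referred to here, the sketch below computes it for two closed real intervals, where it reduces to the larger of the two endpoint differences. This shows only the underlying distance; the paper's special definition of interval quantiles is not reproduced, and the name `hausdorff_interval` is hypothetical.

```python
def hausdorff_interval(a, b):
    """Hausdorff distance between closed intervals a = (lo, hi) and b = (lo, hi)."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    if a_lo > a_hi or b_lo > b_hi:
        raise ValueError("each interval must satisfy lo <= hi")
    # For closed intervals the distance is the larger endpoint discrepancy.
    return max(abs(a_lo - b_lo), abs(a_hi - b_hi))


print(hausdorff_interval((0, 1), (2, 5)))
```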

11.
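Several Issues Concerning Logic and the Modernization of Logic: A Critique of Exclusive Deductivism   Total citations: 1 (self-citations: 0, citations by others: 1)
China's five major debates on logic show that the discipline of logic itself is still insufficiently standardized and incomplete in places, and that its limitations and weak practical effectiveness make it hard to meet the demands of contemporary high technology and culture. Logic must improve, reform, and develop. Continuing to view logic and its modernization through exclusive deductivism is out of step with the times.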

12.
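On the Tension between Social Selection and Natural Selection   Total citations: 3 (self-citations: 2, citations by others: 3)
Social selection and natural selection are intrinsic determinations of society and nature respectively, and the object relations of the latter give rise to a tension of mutual constraint, or "pulling", between them. The non-harmonious nature of this tension is expressed directly, in the dimensions of time and space, as the alternation and exchange of strong and weak positions, and, mediated by social formations, it is fully displayed in the logic of historical development. In reality, social evolution can ultimately rely only on social selection; otherwise "social atavism" will occur. The proper orientation of social selection is not "coordination between society and nature" but an endless mutual "running-in" of society and nature.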

13.
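Term collection is an important step in the review and approval of terminology. In general, terms are collected within the framework of the discipline's system, and the source literature chosen should be scientific, representative, and authoritative. Terms are collected in batches and by level, with attention to the selection of basic terms and new terms. Once the collected terms are pooled, they are arranged and edited according to the concept system to ensure the balance, systematicity, and completeness of the system. In addition, the list must be checked for duplicates.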

14.
Mokken scale analysis uses an automated bottom-up stepwise item selection procedure that suffers from two problems. First, items satisfy the scaling conditions when they are selected during the procedure, but they may fail to do so after the scale has been completed. Second, the procedure is approximate and thus may not produce the optimal item partitioning. This study investigates a variation on Mokken’s item selection procedure, which alleviates the first problem, and proposes a genetic algorithm, which alleviates both problems. The genetic algorithm is an approximation to checking all possible partitionings. A simulation study shows that the genetic algorithm leads to better scaling results than the other two procedures.

15.
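The deterministic view of evolution has long held a dominant position, whether as a theoretical requirement in terms of reality or in terms of explanation. With the rise of discussion of evolutionary mechanisms at the molecular level, a debate over evolutionary indeterminism emerged and quickly became a debate about realism. The defense of evolutionary indeterminism offered by Brandon and others, based on the idea of explanatory percolation, has drawn much criticism in this debate. This paper therefore examines the percolation of indeterministic elements from a different angle, in order to provide a realist foundation for evolutionary indeterminism.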

16.
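Starting from an analysis of the contradiction between Darwinian natural selection and the biodiversity actually observed in nature, and of its dispute with the neutral theory, this paper proposes a new picture in which the essence of natural selection is "the worst are necessarily eliminated, the different are preserved". It also discusses necessity and chance in biological evolution.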

17.
Sometimes a larger dataset needs to be reduced to just a few points, and it is desirable that these points be representative of the whole dataset. If the future uses of these points are not fully specified in advance, standard decision-theoretic approaches will not work. We present here methodology for choosing a small representative sample based on a mixture modeling approach.

18.
We describe a novel extension to the Class-Cover-Catch-Digraph (CCCD) classifier, specifically tuned to detection problems. These are two-class classification problems where the natural priors on the classes are skewed by several orders of magnitude. The emphasis of the proposed techniques is on computationally efficient classification for real-time applications. Our principal contribution consists of two boosted classifiers built upon the CCCD structure, one in the form of a sequential decision process and the other in the form of a tree. Both of these classifiers achieve performance comparable to that of the original CCCD classifiers, but at drastically reduced computational expense. An analysis of classification performance and computational cost is performed using data from a face detection application. Comparisons are provided with Support Vector Machines (SVM) and reduced SVMs. These comparisons show that while some SVMs may achieve higher classification performance, their computational burden can be so high as to make them unusable in real-time applications. On the other hand, the proposed classifiers combine high detection performance with extremely fast classification.

19.
20.
Based on the notion of mutual information between the components of a random vector, we construct, for data reduction reasons, an optimal quantization of the support of its probability measure. More precisely, we propose a simultaneous discretization of the whole set of the components of the random vector which takes into account, as much as possible, the stochastic dependence between them. Examples are presented.
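The quantity this abstract builds on, the mutual information between two discretized components, can be computed from a table of joint counts. The sketch below is an illustration of that quantity only, in nats; it is not the paper's quantization procedure, and the name `mutual_information` is hypothetical.

```python
import math


def mutual_information(joint):
    """Mutual information (in nats) of two discrete variables from joint counts.

    joint[i][j] is the number of observations in bin i of X and bin j of Y.
    """
    total = sum(sum(row) for row in joint)
    if total <= 0:
        raise ValueError("the table must contain at least one observation")
    px = [sum(row) / total for row in joint]
    py = [sum(row[j] for row in joint) / total for j in range(len(joint[0]))]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, count in enumerate(row):
            if count > 0:  # empty cells contribute nothing
                pxy = count / total
                mi += pxy * math.log(pxy / (px[i] * py[j]))
    return mi


print(mutual_information([[1, 1], [1, 1]]))  # independent components
print(mutual_information([[1, 0], [0, 1]]))  # fully dependent components
```

A discretization that merges bins can only keep or lower this value, so comparing it before and after quantization gives one way to judge how much of the dependence between components a given discretization retains.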

