Found 20 similar articles (search time: 15 ms)
1.
Thaddeus Tarpey 《Journal of Classification》1998,15(1):57-79
The set of k points that optimally represents a distribution in terms of mean squared error has been called the set of principal points (Flury 1990). Principal points are a special case of self-consistent points. Any given set of k distinct points in $\mathbb{R}^p$ induces a partition of $\mathbb{R}^p$ into Voronoi regions, or domains of attraction, according to minimal distance. A set of k points is called self-consistent for a distribution if each point equals the conditional mean of the distribution over its respective Voronoi region. For symmetric multivariate distributions, sets of self-consistent points typically form symmetric patterns. This paper investigates the optimality of different symmetric patterns of self-consistent points for symmetric multivariate distributions and in particular for the bivariate normal distribution. These results are applied to the problem of estimating principal points.
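The self-consistency condition in this abstract can be illustrated on a finite sample. The sketch below is an assumption-laden, one-dimensional, empirical analogue (it is not taken from the paper): each point is replaced by the sample mean of its Voronoi region until nothing changes, which is the familiar Lloyd-type iteration. The function names `voronoi_means` and `self_consistent_points` are hypothetical.

```python
from statistics import mean

def voronoi_means(data, points):
    """Sample mean of the data in each point's Voronoi region (1-D)."""
    regions = [[] for _ in points]
    for x in data:
        # Assign x to its nearest point; ties go to the lower index.
        nearest = min(range(len(points)), key=lambda j: abs(x - points[j]))
        regions[nearest].append(x)
    # Assumption of this sketch: a point with an empty region stays put.
    return [mean(r) if r else p for r, p in zip(regions, points)]

def self_consistent_points(data, points, tol=1e-12, max_iter=1000):
    """Iterate until each point equals the mean of its own region."""
    for _ in range(max_iter):
        updated = voronoi_means(data, points)
        if max(abs(a - b) for a, b in zip(updated, points)) <= tol:
            return updated
        points = updated
    return points

sample = [-3.0, -1.0, 1.0, 3.0]
print(self_consistent_points(sample, [-1.0, 1.0]))
```

For this symmetric sample the iteration settles on a symmetric pair of points, in line with the abstract's remark that self-consistent points of symmetric distributions typically form symmetric patterns.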
2.
Clustering of multivariate spatial-time series should consider: 1) the spatial nature of the objects to be clustered; 2) the characteristics of the feature space, namely the space of multivariate time trajectories; 3) the uncertainty associated with the assignment of a spatial unit to a given cluster on the basis of the above complex features. The last aspect is dealt with by using the Fuzzy C-Means objective function, based on appropriate measures of dissimilarity between time trajectories that distinguish the cross-sectional and longitudinal aspects of the trajectories. In order to take into account the spatial nature of the statistical units, a spatial penalization term is added to the above function, depending on a suitable spatial proximity/contiguity matrix. A tuning coefficient balances, on the one hand, discrimination according to the pattern of the time trajectories and, on the other hand, an approximate spatial homogeneity of the clusters. A technique for determining an optimal value of this coefficient is proposed, based on an appropriate spatial autocorrelation measure. Finally, the proposed models are applied to the classification of the Italian provinces, on the basis of the observed dynamics of some socioeconomic indicators.
3.
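Ethical problems raised by modern science and technology are becoming increasingly prominent. Such problems need to be constrained by corresponding institutions, and this constraint is determined mainly by several factors: the limited nature of freedom, the uncertainty of self-discipline, and the coercive force of institutions. Given how institutions for the ethics of modern science and technology are currently positioned and where they fall short, sound institutions need to be selected and established to constrain the ethical problems that scientific and technological development brings, so that science and technology can better benefit humanity.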
4.
As data sets continue to grow in size and complexity, effective and efficient techniques are needed to target important features in the variable space. Many of the variable selection techniques that are commonly used alongside clustering algorithms are based upon determining the best variable subspace according to model fitting in a stepwise manner. These techniques are often computationally intensive and can require extended periods of time to run; in fact, some are prohibitively computationally expensive for high-dimensional data. In this paper, a novel variable selection technique is introduced for use in clustering and classification analyses that is both intuitive and computationally efficient. We focus largely on applications in mixture model-based learning, but the technique could be adapted for use with various other clustering/classification methods. Our approach is illustrated on both simulated and real data, highlighted by contrasting its performance with that of other comparable variable selection techniques on the real data sets.
5.
6.
7.
Data holders, such as statistical institutions and financial organizations, have a very serious and demanding task when producing data for official and public use: controlling the risk of identity disclosure and protecting sensitive information when they communicate data-sets among themselves, to governmental agencies, and to the public. One of the techniques applied is micro-aggregation. In a Bayesian setting, micro-aggregation can be viewed as the optimal partitioning of the original data-set based on the minimization of an appropriate measure of discrepancy, or distance, between two posterior distributions, one of which is conditional on the original data-set and the other conditional on the aggregated data-set. Assuming d-variate normal data-sets and using several measures of discrepancy, it is shown that the asymptotically optimal equal probability m-partition of $\mathbb{R}^d$, with $m^{1/d} \in \mathbb{N}$, is the convex one provided by hypercubes whose sides are formed by hyperplanes perpendicular to the canonical axes, no matter which discrepancy measure has been used. On the basis of the above result, a method that produces a sub-optimal partition with a very small computational cost is presented.
8.
9.
Fionn Murtagh 《Journal of Classification》1998,15(2):161-183
We discuss the use of orthogonal wavelet transforms in preprocessing multivariate data for subsequent analysis, e.g., by clustering or dimensionality reduction. Wavelet transforms allow us to introduce multiresolution approximation, and multiscale nonparametric regression or smoothing, in a natural and integrated way into the data analysis. As will be explained in the first part of the paper, this approach is of greatest interest for multivariate data analysis when we use (i) datasets with ordered variables, e.g., time series, and (ii) object dimensionalities which are not too small, e.g., 16 and upwards. In the second part of the paper, a different type of wavelet decomposition is used. Applications illustrate the power of this new perspective on data analysis.
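As a rough illustration of what an orthogonal wavelet transform does to one object with ordered variables, here is a minimal sketch of the orthonormal Haar transform. The Haar wavelet is chosen here only for simplicity; the abstract does not say which wavelet the paper uses, and the function names are hypothetical.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform (even length required)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    smooth = [(a + b) / root2 for a, b in pairs]   # coarse approximation
    detail = [(a - b) / root2 for a, b in pairs]   # local differences
    return smooth, detail

def haar_transform(signal):
    """Full decomposition; the length must be a power of two."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    smooth, coeffs = list(signal), []
    while len(smooth) > 1:
        smooth, detail = haar_step(smooth)
        coeffs = detail + coeffs   # coarser details go in front
    return smooth + coeffs

series = [4.0, 2.0, 5.0, 5.0]
print(haar_transform(series))
```

Because the transform is orthogonal, Euclidean distances between objects are unchanged by it, so any effect on a later clustering comes from thresholding or dropping detail coefficients afterwards.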
10.
Andrzej Młodak 《Journal of Classification》2011,28(3):327-362
The paper proposes a method of interval data clustering for social and economic objects characterized by many interval variables. This multivariate approach is based on an original conception of interval quantiles constructed using a special definition derived from the notion of the Hausdorff distance. In order to improve the quality of classification, the obtained interval quantile classes can then be aggregated into larger merged classes. The efficiency of the method can be assessed using specially defined indices of entropy and volume coefficients. The second notion replaces the classical concept of area, which is not applicable in this case.
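For orientation, the standard Hausdorff distance between two closed real intervals reduces to the larger of the two endpoint gaps. The sketch below shows only that textbook distance; the paper's own interval-quantile definition is not given in the abstract and is not reproduced here. The function name is hypothetical.

```python
def hausdorff_interval(a, b):
    """Hausdorff distance between closed intervals a=(lo, hi) and b=(lo, hi)."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    if a_lo > a_hi or b_lo > b_hi:
        raise ValueError("each interval must satisfy lo <= hi")
    # For closed intervals the distance is the larger endpoint gap.
    return max(abs(a_lo - b_lo), abs(a_hi - b_hi))

print(hausdorff_interval((0.0, 2.0), (1.0, 5.0)))
```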
11.
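Some Questions on Logic and the Modernization of Logic: A Critique of Exclusive Deductivism  Cited by: 1 in total (self-citations: 0, citations by others: 1)
The five debates on logic in China show that the discipline of logic itself is still not fully standardized or complete, and that its limitations and weak practical effectiveness make it hard to meet the demands of today's advanced technology and culture. Logic needs to be improved, reformed, and developed. Continuing to view logic and its modernization through exclusive deductivism is out of step with the times.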
12.
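On the Tension between Social Selection and Natural Selection  Cited by: 3 in total (self-citations: 2, citations by others: 3)
Social selection and natural selection are the inherent determinations of society and of nature, respectively; because of the object-relation of the latter, a tension of mutual constraint or "pulling" arises between them. The inherently non-harmonious character of this tension is expressed directly, in the dimensions of time and space, as the alternation and exchange of stronger and weaker positions, and, mediated by social formations, it is fully displayed in the logic of historical development. In reality, social evolution can ultimately rely only on social selection; otherwise "social atavism" will occur. The proper orientation of social selection is not "coordination between society and nature" but an unending "running-in" between society and nature.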
13.
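Term collection is an important step in the work of terminology review and approval. In general, terms are collected within the framework of the discipline's system, and the source literature selected should be scientific, representative, and authoritative; terms are collected in batches and by level, with emphasis on basic terms and new terms; once the collected terms are compiled, they are arranged and processed according to the concept system to ensure balance, systematicity, and completeness; in addition, a check for duplicates is required.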
14.
Mokken scale analysis uses an automated bottom-up stepwise item selection procedure that suffers from two problems. First, items satisfy the scaling conditions when they are selected during the procedure, but they may fail to do so after the scale has been completed. Second, the procedure is approximate and thus may not produce the optimal item partitioning. This study investigates a variation on Mokken’s item selection procedure, which alleviates the first problem, and proposes a genetic algorithm, which alleviates both problems. The genetic algorithm is an approximation to checking all possible partitionings. A simulation study shows that the genetic algorithm leads to better scaling results than the other two procedures.
15.
16.
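A New Picture of Natural Selection: With a Discussion of the Roles of Necessity and Chance in Biological Evolution  Cited by: 14 in total (self-citations: 0, citations by others: 14)
Starting from an analysis of the contradiction between Darwinian natural selection and the biodiversity observed in nature, and of its dispute with the neutral theory, this paper proposes a new picture in which the essence of natural selection is "the worst are necessarily eliminated, and differences are preserved." It also discusses necessity and chance in biological evolution.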
17.
Sometimes a larger dataset needs to be reduced to just a few points, and it is desirable that these points be representative of the whole dataset. If the future uses of these points are not fully specified in advance, standard decision-theoretic approaches will not work. We present here methodology for choosing a small representative sample based on a mixture modeling approach.
18.
Christopher K. Eveland Diego A. Socolinsky Carey E. Priebe David J. Marchette 《Journal of Classification》2005,22(1):17-48
We describe a novel extension to the Class-Cover-Catch-Digraph (CCCD) classifier, specifically tuned to detection problems. These are two-class classification problems where the natural priors on the classes are skewed by several orders of magnitude. The emphasis of the proposed techniques is in computationally efficient classification for real-time applications. Our principal contribution consists of two boosted classifiers built upon the CCCD structure, one in the form of a sequential decision process and the other in the form of a tree. Both of these classifiers achieve performances comparable to that of the original CCCD classifiers, but at drastically reduced computational expense. An analysis of classification performance and computational cost is performed using data from a face detection application. Comparisons are provided with Support Vector Machines (SVM) and reduced SVMs. These comparisons show that while some SVMs may achieve higher classification performance, their computational burden can be so high as to make them unusable in real-time applications. On the other hand, the proposed classifiers combine high detection performance with extremely fast classification.
19.
20.
Bernard Colin François Dubeau Hussein Khreibani Jules de Tibeiro 《Journal of Classification》2013,30(3):453-473
Based on the notion of mutual information between the components of a random vector, we construct, for data reduction reasons, an optimal quantization of the support of its probability measure. More precisely, we propose a simultaneous discretization of the whole set of the components of the random vector which takes into account, as much as possible, the stochastic dependence between them. Examples are presented.
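The quantity at the heart of this abstract, the mutual information between discretized components, can be computed from a contingency table of cell counts. The sketch below covers only the two-component case and shows the measure itself, not the authors' procedure for choosing the quantization; the function name and the example tables are hypothetical.

```python
import math

def mutual_information(counts):
    """Mutual information (in nats) of a two-way table of non-negative counts."""
    total = sum(sum(row) for row in counts)
    if total <= 0:
        raise ValueError("table must contain at least one observation")
    row_p = [sum(row) / total for row in counts]
    col_p = [sum(col) / total for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, count in enumerate(row):
            if count:  # empty cells contribute nothing
                p = count / total
                mi += p * math.log(p / (row_p[i] * col_p[j]))
    return mi

independent = [[1, 1], [1, 1]]   # components carry no information about each other
dependent = [[1, 0], [0, 1]]     # one component determines the other
print(mutual_information(independent), mutual_information(dependent))
```

A discretization that merges cells can only lose mutual information, which is why a quantization that respects the dependence between components aims to keep this value as close as possible to that of the original distribution.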