Similar literature
1.
2.
3.
In this paper we will offer a few examples to illustrate the orientation of contemporary research in data analysis and we will investigate the corresponding role of mathematics. We argue that the modus operandi of data analysis is implicitly based on the belief that if we have collected enough and sufficiently diverse data, we will be able to answer most relevant questions concerning the phenomenon itself. This is a methodological paradigm strongly related, but not limited to, biology, and we label it the microarray paradigm. In this new framework, mathematics provides powerful techniques and general ideas which generate new computational tools. But it is missing any explicit isomorphism between a mathematical structure and the phenomenon under consideration. This methodology used in data analysis suggests the possibility of forecasting and analyzing without a structured and general understanding. This is the perspective we propose to call agnostic science, and we argue that, rather than diminishing or flattening the role of mathematics in science, the lack of isomorphisms with phenomena liberates mathematics, paradoxically making more likely the practical use of some of its most sophisticated ideas.

4.
5.
6.
M-estimators, and building clustering objective functions. Finally, using the common thread of concavity, all three will be combined to build a comprehensive, flexible procedure for robust cluster analysis.

7.
8.
The paper presents a methodology for classifying three-way dissimilarity data, which are reconstructed by a small number of consensus classifications of the objects, each defined by a sum of two order-constrained distance matrices, so as to identify both a partition and an indexed hierarchy. Specifically, the dissimilarity matrices are partitioned into homogeneous classes and, within each class, a partition and an indexed hierarchy are simultaneously fitted. The proposed model is formalized as a constrained mixed-integer quadratic problem to be fitted in the least-squares sense, and a computationally efficient alternating least-squares algorithm is proposed. Two applications of the methodology are also described, together with an extensive simulation to investigate the performance of the algorithm.

9.
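With the emergence of massive, intelligent tools such as cloud computing, big data analysis plays an ever larger role in production and daily life, and it presents a cognitive structure that is point-like, predictive, and quantitative. Although big data analysis is an analysis of correlations, it is carried out by collecting, recording, and modeling aggregate phenomena. Big data analysis is only one of the human modes of cognition, and specifically a kind of quantitative analysis that sets aside qualitative causal explanation; together with qualitative causal analysis it forms the two wings of human methods of knowing. It cannot replace, let alone dissolve, causal explanation, which is the most essential part of human knowledge. However intelligent big data analysis becomes, it will always remain a computational tool operated and controlled by humans.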

10.
11.
We discuss the use of orthogonal wavelet transforms in preprocessing multivariate data for subsequent analysis, e.g., by clustering or dimensionality reduction. Wavelet transforms allow us to introduce multiresolution approximation, and multiscale nonparametric regression or smoothing, in a natural and integrated way into the data analysis. As will be explained in the first part of the paper, this approach is of greatest interest for multivariate data analysis when we use (i) datasets with ordered variables, e.g., time series, and (ii) object dimensionalities which are not too small, e.g., 16 and upwards. In the second part of the paper, a different type of wavelet decomposition is used. Applications illustrate the power of this new perspective on data analysis.
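As a rough illustration of what an orthogonal wavelet transform does to one object with ordered variables, the sketch below implements the orthonormal Haar transform in plain Python. The Haar basis is chosen here only because it is the simplest orthogonal wavelet; the abstract does not say which wavelet the paper uses, and the function names `haar_transform` and `haar_inverse` are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_transform(signal):
    """Orthonormal Haar decomposition of a sequence whose length is a power of two.

    Returns (smooth, details): the single coarsest approximation coefficient
    and a list of detail-coefficient lists, finest scale first.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(v) for v in signal]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / SQRT2 for a, b in pairs])  # local differences
        approx = [(a + b) / SQRT2 for a, b in pairs]         # local averages
    return approx[0], details


def haar_inverse(smooth, details):
    """Invert haar_transform, rebuilding the signal from coarse to fine."""
    approx = [smooth]
    for level in reversed(details):
        finer = []
        for a, w in zip(approx, level):
            finer.extend([(a + w) / SQRT2, (a - w) / SQRT2])
        approx = finer
    return approx


# Example: a short series (a stand-in for one row of a multivariate dataset).
series = [4, 6, 10, 12, 8, 6, 5, 5]
smooth, details = haar_transform(series)

# Multiresolution smoothing: drop the finest-scale details, then invert.
coarse_details = [[0.0] * len(details[0])] + details[1:]
smoothed = haar_inverse(smooth, coarse_details)
```

Because the transform is orthogonal, Euclidean distances between objects are the same before and after it, so clustering on the full set of coefficients is unchanged; any gain comes from discarding or shrinking the small detail coefficients first.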

12.
13.
14.
15.
The triangular inequality is a defining property of a metric space, while the stronger ultrametric inequality is a defining property of an ultrametric space. Ultrametric distance is defined from p-adic valuation. It is known that ultrametricity is a natural property of spaces in the sparse limit. The implications of this are discussed in this article. Experimental results are presented which quantify how ultrametric a given metric space is. We explore the practical meaningfulness of this property of a space being ultrametric. In particular, we examine the computational implications of widely prevalent and perhaps ubiquitous ultrametricity.
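One simple way to put a number on "how ultrametric" a finite metric space is can be sketched as follows. This is an assumed, illustrative measure and not necessarily the coefficient used in the article: it counts the fraction of point triples that satisfy the ultrametric inequality d(x, z) ≤ max(d(x, y), d(y, z)), which is equivalent to the two largest of the three pairwise distances being equal. The helper `two_adic_distance` (a hypothetical name) builds a 2-adic distance as an example of a distance derived from a p-adic valuation.

```python
from itertools import combinations


def ultrametric_fraction(dist, rel_tol=1e-9):
    """Fraction of triples in a symmetric distance matrix that are ultrametric.

    A triple is ultrametric when its two largest pairwise distances coincide
    (within a relative tolerance), i.e. the triangle is isosceles with a
    small base, or equilateral.
    """
    total = 0
    ok = 0
    for i, j, k in combinations(range(len(dist)), 3):
        _, mid, largest = sorted((dist[i][j], dist[i][k], dist[j][k]))
        total += 1
        if largest - mid <= rel_tol * max(largest, 1.0):
            ok += 1
    return ok / total if total else 1.0


def two_adic_distance(x, y):
    """2-adic distance between integers: 2 ** (-v), where 2**v exactly divides x - y."""
    if x == y:
        return 0.0
    diff = abs(x - y)
    v = 0
    while diff % 2 == 0:
        diff //= 2
        v += 1
    return 2.0 ** -v


points = list(range(8))
padic = [[two_adic_distance(x, y) for y in points] for x in points]
line = [[abs(x - y) for y in range(4)] for x in range(4)]

print(ultrametric_fraction(padic))  # every triple is ultrametric
print(ultrametric_fraction(line))   # equally spaced points on a line: none are
```

The loop visits all triples, so the cost grows cubically with the number of points; for large datasets one would sample triples instead.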

16.
We consider two fundamental properties in the analysis of two-way tables of positive data: the principle of distributional equivalence, one of the cornerstones of correspondence analysis of contingency tables, and the principle of subcompositional coherence, which forms the basis of compositional data analysis. For an analysis to be subcompositionally coherent, it suffices to analyze the ratios of the data values. A common approach to dimension reduction in compositional data analysis is to perform principal component analysis on the logarithms of ratios, but this method does not obey the principle of distributional equivalence. We show that by introducing weights for the rows and columns, the method achieves this desirable property and can be applied to a wider class of methods. This weighted log-ratio analysis is theoretically equivalent to “spectral mapping”, a multivariate method developed almost 30 years ago for displaying ratio-scale data from biological activity spectra. The close relationship between spectral mapping and correspondence analysis is also explained, as well as their connection with association modeling. The weighted log-ratio methodology is used here to visualize frequency data in linguistics and chemical compositional data in archeology. The first author acknowledges research support from the Fundación BBVA in Madrid as well as partial support by the Spanish Ministry of Education and Science, grant MEC-SEJ2006-14098. The constructive comments of the referees, who also brought additional relevant literature to our attention, significantly improved our article.
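The core preprocessing step of a weighted log-ratio analysis can be sketched in a few lines. The sketch assumes, as in correspondence analysis, that the row and column weights are the table's row and column masses (marginal proportions); the abstract does not state the weights explicitly, so treat this choice and the function names as assumptions. The log-transformed table is double-centered using those weights, and the weighted sum of squares of the result gives a total log-ratio variance. A singular value decomposition of the weighted matrix, omitted here, would then produce the low-dimensional map.

```python
import math


def weighted_logratio_matrix(table):
    """Weighted double-centering of the log of a strictly positive table.

    Returns (y, r, c): the centered matrix and the row and column masses
    used as weights (assumed here to be the marginal proportions).
    """
    total = sum(sum(row) for row in table)
    r = [sum(row) / total for row in table]          # row masses
    c = [sum(col) / total for col in zip(*table)]    # column masses
    logs = [[math.log(v) for v in row] for row in table]
    n_rows, n_cols = len(r), len(c)
    row_mean = [sum(c[j] * logs[i][j] for j in range(n_cols)) for i in range(n_rows)]
    col_mean = [sum(r[i] * logs[i][j] for i in range(n_rows)) for j in range(n_cols)]
    grand = sum(r[i] * row_mean[i] for i in range(n_rows))
    y = [
        [logs[i][j] - row_mean[i] - col_mean[j] + grand for j in range(n_cols)]
        for i in range(n_rows)
    ]
    return y, r, c


def total_logratio_variance(y, r, c):
    """Weighted sum of squares of the double-centered log matrix."""
    return sum(
        r[i] * c[j] * y[i][j] ** 2
        for i in range(len(r))
        for j in range(len(c))
    )


counts = [[10, 2, 3], [1, 8, 4], [2, 3, 9]]
y, r, c = weighted_logratio_matrix(counts)
print(total_logratio_variance(y, r, c))
```

A table whose rows are all proportional to one another has no ratio structure to display, so its total log-ratio variance is zero up to rounding; this makes a convenient sanity check for the centering step.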

17.
18.
19.
20.
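Amid the global wave of big data, big data has affected traditional scientific research. The paper first clarifies the mechanism of this influence: supported by big data technology, data activities affect traditional scientific research at the empirical level and the methodological level. It then points out that this influence has brought about shifts in three aspects of scientific research: its objects, its levels, and its types.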
