Found 20 similar documents (search time: 0 ms)
1.
Joost van Rosmalen, Patrick J. F. Groenen, Javier Trejos, William Castillo 《Journal of Classification》2009,26(2):155-181
Two-mode partitioning is a relatively new form of clustering that clusters both rows and columns of a data matrix. In this
paper, we consider deterministic two-mode partitioning methods in which a criterion similar to k-means is optimized. A variety of optimization methods have been proposed for this type of problem. However, it is still unclear
which method should be used, as various methods may lead to non-global optima. This paper reviews and compares several optimization
methods for two-mode partitioning. Several known methods are discussed, and a new fuzzy steps method is introduced. The fuzzy
steps method is based on the fuzzy c-means algorithm of Bezdek (1981) and the fuzzy steps approach of Heiser and Groenen (1997) and Groenen and Jajuga (2001). The performances of all methods are compared in a large simulation study. In our simulations, a two-mode k-means optimization method most often gives the best results. Finally, an empirical data set is used to give a practical example
of two-mode partitioning.
We would like to thank two anonymous referees whose comments have improved the quality of this paper. We are also grateful
to Peter Verhoef for providing the data set used in this paper.
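The abstract above refers to a two-mode k-means optimization method without spelling it out. A minimal sketch of one common alternating scheme follows, assuming a least-squares criterion in which each cell of the data matrix is approximated by the mean of its (row cluster, column cluster) block. The function names, the random initialization, and the handling of empty blocks are assumptions made for illustration, not details taken from the paper.

```python
import random


def block_means(X, rows, cols, K, L):
    """Mean of X over each (row cluster, column cluster) block."""
    total = [[0.0] * L for _ in range(K)]
    count = [[0] * L for _ in range(K)]
    for i, row in enumerate(X):
        for j, value in enumerate(row):
            total[rows[i]][cols[j]] += value
            count[rows[i]][cols[j]] += 1
    # Empty blocks get centre 0.0 (an arbitrary choice made for this sketch).
    return [[total[k][l] / count[k][l] if count[k][l] else 0.0
             for l in range(L)] for k in range(K)]


def criterion(X, rows, cols, V):
    """Sum of squared deviations of each cell from its block centre."""
    return sum((value - V[rows[i]][cols[j]]) ** 2
               for i, row in enumerate(X) for j, value in enumerate(row))


def two_mode_kmeans(X, K, L, max_iter=100, seed=0):
    """Alternate centre, row, and column updates until nothing changes."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    rows = [rng.randrange(K) for _ in range(n)]
    cols = [rng.randrange(L) for _ in range(m)]
    for _ in range(max_iter):
        previous = (rows, cols)
        V = block_means(X, rows, cols, K, L)
        # Reassign every row to the row cluster with the smallest squared error.
        rows = [min(range(K), key=lambda k: sum(
            (X[i][j] - V[k][cols[j]]) ** 2 for j in range(m))) for i in range(n)]
        V = block_means(X, rows, cols, K, L)
        # Reassign every column in the same way.
        cols = [min(range(L), key=lambda l: sum(
            (X[i][j] - V[rows[i]][l]) ** 2 for i in range(n))) for j in range(m)]
        if (rows, cols) == previous:
            break
    V = block_means(X, rows, cols, K, L)
    return rows, cols, V, criterion(X, rows, cols, V)
```

Because a scheme like this can stop in a non-global optimum, as the abstract notes, it would normally be run from several seeds, keeping the solution with the lowest criterion value.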
2.
3.
Data holders, such as statistical institutions and financial organizations, have a very serious and demanding task when producing
data for official and public use: controlling the risk of identity disclosure and protecting sensitive information
when they communicate data-sets among themselves, to governmental agencies and to the public. One of the techniques applied
is that of micro-aggregation. In a Bayesian setting, micro-aggregation can be viewed as the optimal partitioning of the original
data-set based on the minimization of an appropriate measure of discrepancy, or distance, between two posterior distributions,
one of which is conditional on the original data-set and the other conditional on the aggregated data-set. Assuming d-variate
normal data-sets and using several measures of discrepancy, it is shown that the asymptotically optimal equal probability
$m$-partition of $\mathbb{R}^d$, with $m^{1/d} \in \mathbb{N}$, is the convex one which is provided by hypercubes whose sides are formed by hyperplanes perpendicular to the canonical axes,
no matter which discrepancy measure has been used. On the basis of the above result, a method that produces a sub-optimal
partition with a very small computational cost is presented.
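The geometry of such a partition can be illustrated with a short sketch. It assumes the data have already been standardized so that the coordinates are independent standard normal variables; under that assumption, cutting each axis at the normal quantiles i/k gives k^d cells of equal probability. The helper names `axis_cuts`, `cell_of`, and `microaggregate` are hypothetical, and this illustrates only the axis-aligned partition, not the method presented in the paper.

```python
from bisect import bisect_right
from statistics import NormalDist


def axis_cuts(k):
    """Return the k - 1 cut points that split N(0, 1) into k equiprobable intervals."""
    std = NormalDist()
    return [std.inv_cdf(i / k) for i in range(1, k)]


def cell_of(point, cuts):
    """Index of the axis-aligned cell containing the point, one index per axis."""
    return tuple(bisect_right(cuts, x) for x in point)


def microaggregate(points, k):
    """Replace each point by the mean of the points that share its cell."""
    cuts = axis_cuts(k)
    groups = {}
    for p in points:
        groups.setdefault(cell_of(p, cuts), []).append(p)
    means = {cell: tuple(sum(col) / len(group) for col in zip(*group))
             for cell, group in groups.items()}
    return [means[cell_of(p, cuts)] for p in points]
```

With d coordinates and k intervals per axis, the number of cells is m = k^d, which is why the sketch is parameterized by k rather than by m.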
4.
5.
Optimal Variable Weighting for Ultrametric and Additive Trees and K-means Partitioning: Methods and Software (cited by: 1; self-citations: 0; citations by others: 1)
K-means partitioning. We also describe some new features and improvements to the algorithm proposed by De Soete. Monte Carlo simulations have been conducted using different error conditions. In all cases (i.e., ultrametric or additive trees, or K-means partitioning), the simulation results indicate that the optimal weighting procedure should be used for analyzing data containing noisy variables that do not contribute relevant information to the classification structure. However, if the data involve error-perturbed variables that are relevant to the classification or outliers, it seems better to cluster or partition the entities by using variables with equal weights. A new computer program, OVW, which is available to researchers as freeware, implements improved algorithms for optimal variable weighting for ultrametric and additive tree clustering, and includes a new algorithm for optimal variable weighting for K-means partitioning.
6.
In this paper, we present empirical and theoretical results on classification trees for randomized response data. We considered a dichotomous sensitive response variable with the true status intentionally misclassified by the respondents using rules prescribed by a randomized response method. We assumed that classification trees are grown using the Pearson chi-square test as a splitting criterion, and that the randomized response data are analyzed using classification trees as if they were not perturbed. We proved that classification trees analyzing observed randomized response data and estimated true data have a one-to-one correspondence in terms of ranking the splitting variables. This is illustrated using two real data sets.
7.
We consider correspondence analysis (CA) and taxicab correspondence analysis (TCA) of relational datasets that can mathematically be described as weighted loopless graphs. Such data appear in particular in network analysis. We present CA and TCA as relaxation methods for the graph partitioning problem. Examples of real datasets are provided.
8.
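Strategies for Mutual Benefit between Terminology Translation and Terminology Standardization (cited by: 2; self-citations: 0; citations by others: 2)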
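Terminology translation and terminology standardization are closely linked and influence each other. From the perspective of terminology translation, this paper discusses how terminology translation can exert a positive influence on terminology standardization; from the perspective of building terminology as a discipline, it discusses how to promote the standardized translation of terms. The paper argues that translators must master and apply knowledge of terminology science in order to achieve standardized and normalized terminology translation, and thereby contribute positively to terminology standardization. At the same time, terminology standardization work must also broaden its vision and scope within the macro framework of the development of terminology science.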
9.
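Terminology translation and terminology standardization are closely linked and influence each other. From the perspective of terminology translation, this paper discusses how terminology translation can exert a positive influence on terminology standardization; from the perspective of building terminology as a discipline, it discusses how to promote the standardized translation of terms. The paper argues that translators must master and apply knowledge of terminology science in order to achieve standardized and normalized terminology translation, and thereby contribute positively to terminology standardization. At the same time, terminology standardization...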
10.
Classical unidimensional scaling poses a difficult combinatorial problem. A procedure formulated as a nonlinear programming
(NLP) model is proposed to solve this problem. The new method can be implemented with standard mathematical programming software.
Unlike the traditional procedures that minimize either the sum of squared errors ($L_2$ norm) or the sum of absolute errors ($L_1$ norm), the proposed method can minimize the error based on any $L_p$ norm for $1 \le p < \infty$. Extensions of the NLP formulation to address a multidimensional scaling problem under the city-block model are also
discussed.
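The abstract does not give the NLP formulation itself, but the objective it describes can be sketched. The function below assumes the usual unidimensional scaling setup, in which each object i receives a coordinate x[i] on a line and the error for a pair is the gap between the observed dissimilarity d[i][j] and the fitted distance |x[i] - x[j]|. The name `lp_stress` is hypothetical, and the code only evaluates the criterion; it does not perform the optimization.

```python
from itertools import combinations


def lp_stress(d, x, p=2.0):
    """Sum over pairs i < j of |d[i][j] - |x[i] - x[j]|| ** p, for p >= 1."""
    if p < 1:
        raise ValueError("p must satisfy 1 <= p < infinity")
    return sum(abs(d[i][j] - abs(x[i] - x[j])) ** p
               for i, j in combinations(range(len(x)), 2))
```

Setting p = 2 gives the sum of squared errors and p = 1 gives the sum of absolute errors, the two traditional criteria mentioned in the abstract.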
11.
13.
《自然辩证法研究》(Studies in Dialectics of Nature) 2019,(3):56-61
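At present, although climate-related scientific knowledge has made considerable progress, the public's level of understanding of climate change has not risen along with scientific progress. From the perspective of the philosophy of science, this study analyzes the challenges currently encountered in improving public climate literacy, and argues that the main problems are the uncertainty and controversy within climate science itself, the uncertainty of climate change communication, and the widening gap in public understanding caused by a lack of scientific consensus. Further study finds that the scientific uncertainty of climate change cannot be resolved in the short term, but that the uncertainty in climate change communication can be reduced by improving the ways and methods of climate education: namely, through three modes of reasoning (analogical reasoning, argumentative reasoning, and refutational reasoning), attention to the important role of non-formal education, and the use of interdisciplinary methods to integrate the social sciences with climate change knowledge, so as to improve public climate literacy.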
14.
Michael J. Brusco 《Journal of Classification》2002,19(1):45-67
$L_1$-norm are also presented. I conclude that the
computational difficulty of unidimensional scaling problems depends largely on the criterion of
interest, with unidimensional scaling in the $L_1$-norm being
especially challenging.
15.
16.
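Strengthening the Construction of State Key Laboratories and Enhancing China's Capacity for Independent Innovation (cited by: 2; self-citations: 0; citations by others: 2)
Comrades:
On the eve of the Two Sessions, the Ministry of Science and Technology and the Ministry of Finance are today convening a working conference on State Key Laboratories, announcing the establishment of dedicated funding for State Key Laboratories and setting out the next phase of their work. This is an important measure by the Ministry of Science and Technology and the Ministry of Finance to implement the spirit of the 17th National Congress of the Communist Party of China and to carry out the tasks of the Outline of the National Medium- and Long-Term Program for Science and Technology Development (2006-2020),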
17.
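I. Strategic Recommendations on the Development Direction of the French National Centre for Scientific Research and Its International Cooperation Strategy. The French National Centre for Scientific Research (Le Centre National de la Recherche Scientifique, CNRS) is a public scientific and technological national organization for basic research under the French Ministry of Youth, Education and Research. It produces knowledge and thereby serves society. The CNRS has 26,000 staff, including 11,600 researchers and 14,400 engineers, technicians and administrative personnel. Its 2004 budget rose to 2.214 billion euros. CNRS funding mainly supports the research and service activities of 1,260 research units across France, covering all fields of research. On 31 July 2003, Bernard La…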
18.
19.
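In the course of sustainable corporate development, tacit environmental knowledge, such as a firm's environmental awareness, environmental vision, entrepreneurs' environmental mental models, employees' environmental experience and environmental capabilities, and the firm's environmental organizational routines, not only brings improvements in environmental quality but, because of its great value, scarcity, and inimitability, is increasingly regarded by firms as a strategic resource for enhancing green competitiveness. However, owing to factors such as the difficulty of collecting and organizing tacit environmental knowledge, the monopolistic nature of environmental knowledge, the lack of effective intellectual property protection measures and credit systems, and the lack of an ecosystem mode of thinking, managing and developing firms' tacit environmental knowledge has become a difficult problem. This study explores how firms can gain and maintain environmental competitiveness by developing and integrating internal and external tacit environmental knowledge. Managing and developing tacit environmental knowledge inside the firm focuses on uncovering tacit knowledge such as the latent environmental ideas, intuitions, and inspirations in employees' minds and converting it into environmental competitive advantage; managing and developing tacit environmental knowledge outside the firm focuses on integrating and using the tacit environmental knowledge resources of external parties such as customers, competitors, and suppliers.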