A probabilistic DEDICOM model was proposed for mobility tables. The model attempts to explain observed transition probabilities
by a latent mobility table and a set of transition probabilities from latent classes to observed classes. The model captures
asymmetry in observed mobility tables by asymmetric latent mobility tables. It may be viewed as a special case of both the
latent class model and DEDICOM with special constraints. A maximum penalized likelihood (MPL) method was developed for parameter
estimation. The EM algorithm was adapted for the MPL estimation. Two examples were given to illustrate the proposed method.
The work reported in this paper has been supported by grant A6394 to the first author from the Natural Sciences and Engineering
Research Council of Canada and by a fellowship of the Royal Netherlands Academy of Arts and Sciences to the second author.
We propose a non-negative real-valued model of hierarchical classes (HICLAS) for two-way two-mode data. Like the other members
of the HICLAS family, the non-negative real-valued model (NNRV-HICLAS) implies simultaneous hierarchically organized classifications
of all modes involved in the data. A distinctive feature of the novel model is that it yields continuous, non-negative real-valued
reconstructed data, which considerably expands the application range of the HICLAS family. The expansion implies a major algorithmic
challenge as it involves a move from the typical discrete optimization problems in HICLAS to a mixed discrete-continuous one.
To solve this mixed discrete-continuous optimization problem, a two-stage algorithm combining a simulated annealing and an
alternating local descent stage is proposed. Subsequently it is evaluated in a simulation study. Finally, the NNRV-HICLAS model
is applied to an empirical data set on anger.
Jacob Stegenga 《Foundations of Science》2016,21(1):35-49
Consensus conferences are social techniques which involve bringing together a group of scientific experts, and sometimes also non-experts, in order to increase the public role in science and related policy, to amalgamate diverse and often contradictory evidence for a hypothesis of interest, and to achieve scientific consensus or at least the appearance of consensus among scientists. For consensus conferences that set out to amalgamate evidence, I propose three desiderata: Inclusivity (the consideration of all available evidence), Constraint (the achievement of some agreement of intersubjective assessments of the hypothesis of interest), and Evidential Complexity (the evaluation of available evidence based on a plurality of relevant evidential criteria). Two examples suggest that consensus conferences can readily satisfy Inclusivity and Evidential Complexity, but consensus conferences do not as easily satisfy Constraint. I end by discussing the relation between social inclusivity and the three desiderata.
L. Andries van der Ark, Peter G. M. van der Heijden, Dirk Sikkel 《Journal of Classification》1999,16(1):117-137
A major drawback of the latent budget model is that, in general, the
model is not identifiable, which complicates the interpretation of the
model considerably. This paper studies the geometry and identifiability
of the latent budget model. Knowledge of the geometric structure of the
model is used to specify an appropriate criterion to identify the model.
The results are illustrated by an empirical data set.
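Revisiting the nature of information through a comparative study of informatics in different fields
Starting from a comparative study of informatics in different fields, this paper tries to clarify what the various branches of informatics are actually doing and what exactly they mean by the information they posit. It analyzes some common problems that arise in these studies, such as whether information is conserved, whether information can be new or old, and the meaning and soundness of some concepts of information and related concepts. It then offers some reflections on the nature of information: what information is, what makes information information, and the question of the objectivity of information.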
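A standardized Chinese term should aptly reflect the scientific definition of its concept. A "scientific definition" here means a definition that is scientifically complete, and "complete" means that the definition has exactly the same extension as the concept, neither narrower nor broader. This raises a question: how can one judge whether the definition given for a concept is complete, that is, what is the criterion for a complete definition?

I

Consider first an actual example.

Example 1: Cihai (Shanghai Lexicographical Publishing House, 1979 edition) defines "steel" as "the general name for iron-based alloys with a carbon content of 0.025-2%". By this definition, the cold-work die steel designated Cr12 in China, with a carbon content of 2.00-2.30%, would not be called "steel", whereas the high-silicon corrosion-resistant cast iron designated STsi14 in China, with a carbon content of 0.50-0.85%, would be called "steel". The above definition of "steel" is therefore incomplete, and also incorrect.

Examples of this kind occur often in dictionaries and similar terminology collections. Several more, from different sources, follow.

Example 2: "hypoeutectoid steel [metallurgy]: steel with a carbon content below 0.8%" [1]. By this definition, a eutectoid steel with 0.6% carbon and 4% manganese, and certain high-alloy steels (hypereutectoid steels) that contain only 0.4% carbon but whose fully annealed structure already contains secondary carbides (including secondary cementite), would all be called "hypoeutectoid steel".

Example 3: "hypereutectoid steel [metallurgy]: steel with a carbon content above 0.8%" [1]. By this definition, certain high-alloy steels (hypereutectoid steels) that contain only 0.4% carbon but whose fully annealed structure already contains secondary carbides (including secondary cementite) would not be called "hypereutectoid steel".

Example 4: "binary alloy [metallurgy]: an alloy composed of two principal metallic components" [1]. By this definition, an alloy composed of one metallic element and one non-metallic element (iron and carbon, for instance) would not be called a "binary alloy".

Example 5: "nonmagnetic steel [metallurgy]: an alloy steel containing about 12% manganese and sometimes a small amount of nickel; it has almost no magnetism at ordinary temperatures" [1]. By this definition, austenitic stainless steels that have no ferromagnetism in the fully annealed state (for example, stainless steel with 18% chromium and 8% nickel) would not be called "nonmagnetic steel".

Example 6: "combined carbon [metallurgy]: carbon that occurs in cast iron in the form of iron carbide" [1]. By this definition, carbon that occurs in steel in the form of iron carbide, and carbon that occurs in steel and cast iron not as iron carbide but as other kinds of carbide, would not be called "combined carbon".

Example 7: "ferrite: the interstitial solid solution formed by carbon dissolved in α-Fe or δ-Fe" [2]. By this definition, interstitial solid solutions formed by nitrogen or other interstitial elements (rather than carbon) dissolved in α-Fe or δ-Fe, and substitutional solid solutions formed by substitutional elements (for example Cr, Ni, Si) dissolved in α-Fe or δ-Fe, would not be called "ferrite".

Example 8: "pearlite: a mechanical mixture of ferrite and cementite (Fe+Fe3C), generally present as alternating lamellae of ferrite and cementite" [2]. By this definition, pearlite built from alternating lamellae of ferrite and of a carbide other than cementite (for example, the pearlite in the steel designated Cr12 in China, which consists of ferrite lamellae and (Fe,Cr)7C3-type carbide lamellae) would not be called "pearlite".

Example 9: "austenite: the solid solution of carbon in γ-Fe; in alloy steels, the solid solution formed by carbon and alloying elements dissolved in γ-Fe" [3]. By this definition, the austenite in the Fe-N, Fe-Ni, Fe-Cr, Fe-Cr-Ni, Fe-Cr-Mn and similar systems (none of which contains carbon) would not be called "austenite".

Example 10: "ledeburite: in the iron-carbon system, the eutectic of austenite and cementite" [4]. By this definition, eutectics in iron-based alloys outside the iron-carbon binary system, and eutectics not composed of austenite and cementite (for example, the ledeburite in high-speed tool steels and Cr12-type high-alloy tool steels), would not be called "ledeburite".

The analysis of these ten definitions shows that all of them are incomplete, and also incorrect. Incomplete definitions appear not only in Chinese dictionaries but also in the reference works of other countries. On checking, the definitions in Examples 2 to 5 turn out to be translated from the McGraw-Hill Dictionary of Scientific and Technical Terms, 1978, and the definition in Example 6 from the Dictionary of Science and Technology, T. C. Collocott et al., 1974. Both are widely used English dictionaries of science and technology. Giving concepts incomplete definitions is thus a widespread problem.

In most cases an incomplete definition arises because an attribute (or feature) that the concept has only in some particular setting is taken to be its essential attribute (or feature); this holds for all ten examples above. In other cases the definition no longer matches the present state of science and technology worldwide and reflects the concept as it stood in some earlier period (the 1950s or even earlier, say)①. So that our current naming work does not produce inappropriate or wrong terms because it relies on incomplete definitions, and so that definitions written in future are not incomplete, it is necessary to look for a criterion that decides whether a definition is complete.

II

Because we needed to compile a multilingual explanatory dictionary for our own field, we analyzed many Chinese and foreign-language dictionaries and found the following: whenever a term is given an incomplete definition, the inverse proposition of that definition fails to hold. For the definition in Example 1, the inverse proposition is "whatever is not 'an iron-based alloy with a carbon content of 0.025-2%' is not called 'steel'", and the analysis in that example shows that this inverse does not hold. For Example 2, the inverse proposition "whatever is not 'steel with a carbon content below 0.8%' is not called 'hypoeutectoid steel'" does not hold either. For Example 3, the inverse proposition "whatever is not 'steel with a carbon content above 0.80%' is not called 'hypereutectoid steel'" likewise does not hold. The remaining examples follow the same pattern.

If "steel" is defined as "an iron-based alloy whose initial state is the as-cast state (that is, it leaves the furnace in the liquid state) and which can be worked by deformation within certain temperature ranges", then its inverse proposition, "whatever is not 'an iron-based alloy whose initial state is the as-cast state and which can be worked by deformation within certain temperature ranges' is not called 'steel'", does hold. If "hypoeutectoid steel" is defined as "steel whose chemical composition is below the eutectoid composition" or as "steel whose fully annealed structure is coarse pearlite plus proeutectoid ferrite", then the inverse propositions "whatever is not 'steel whose chemical composition is below the eutectoid composition' is not called 'hypoeutectoid steel'" and "whatever is not 'steel whose fully annealed structure is coarse pearlite plus proeutectoid ferrite' is not called 'hypoeutectoid steel'" both hold. If "hypereutectoid steel" is defined as "steel whose chemical composition exceeds the eutectoid composition" or as "steel whose fully annealed structure is coarse pearlite plus secondary carbides (including secondary cementite)", then the inverse propositions both hold.

If the terms in the remaining examples are given the following definitions, their inverse propositions all hold:
(1) Binary alloy: an alloy formed by two components, at least one of which is a metallic element.
(2) Nonmagnetic steel: steel that in practice has no ferromagnetism and therefore cannot be magnetized.
(3) Combined carbon: in iron-based alloys, the carbon contained in metal carbides formed with iron and/or other metallic elements.
(4) Ferrite: a solid solution with a body-centered cubic structure formed by iron and one or several other elements (that is, "iron with carbon and/or other elements").
(5) Pearlite: a lamellar structure of alternating ferrite lamellae and carbide (including cementite) lamellae.
(6) Ledeburite: the eutectic, composed of austenite and carbide (including cementite), that forms through the eutectic transformation during the solidification of an iron-based alloy.

The ten terms listed in this paper are very common terms specific to metallurgical science. The latter ten definitions listed above apply throughout present-day metallurgical science without exception, and are therefore all complete. Their inverse propositions accordingly all hold.

III

The analysis in this paper shows that, for any present-day field of science, if the definition of a term is incomplete then its inverse proposition does not hold, and if the definition of a term is complete then its inverse proposition holds. In sum, whether the inverse proposition holds is the criterion for whether the definition of a term is complete; in other words, the criterion for a complete definition of a term is that its inverse proposition holds.

① This paper does not deal with definitions that are fundamentally wrong for one reason or another, such as: (1) martensite range [metallurgy]: the temperature interval between the temperature at which martensite starts to form (Ms) and the temperature at which martensite is completely formed (Mf) [1]; (2) natural aging [metallurgy]: the aging of a supersaturated metallic solid solution cooling naturally at room temperature [1]; (3) spheroidized structure: granular carbides distributed in the metal matrix structure, obtainable after heat treatment of a metal [2].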
Andrzej Młodak 《Journal of Classification》2011,28(3):327-362
The paper proposes a method for clustering interval data on social and economic objects characterized by many interval variables. This multivariate approach is based on an original conception of interval quantiles constructed using a special definition derived from the notion of the Hausdorff distance. In order to improve the quality of classification, the obtained interval quantile classes can then be aggregated into larger merged classes. The efficiency of the method can be assessed using specially defined indices of entropy and volume coefficients. The volume coefficient replaces the classical concept of area, which is not applicable in this case.
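As background to the abstract above, the following minimal Python sketch computes the Hausdorff distance between two closed real intervals, which reduces to the larger of the two endpoint gaps. It is not the paper's interval-quantile construction, which the abstract does not spell out; the function name `hausdorff_interval` and the tuple representation of an interval are assumptions made for illustration.

```python
from typing import Tuple

# An interval [lo, hi] is represented as a (lo, hi) tuple (an assumption of this sketch).
Interval = Tuple[float, float]


def hausdorff_interval(x: Interval, y: Interval) -> float:
    """Hausdorff distance between closed intervals x = [a, b] and y = [c, d].

    For closed real intervals the general Hausdorff distance simplifies to
    max(|a - c|, |b - d|).
    """
    (a, b), (c, d) = x, y
    if a > b or c > d:
        raise ValueError("each interval must satisfy lo <= hi")
    return max(abs(a - c), abs(b - d))


if __name__ == "__main__":
    # Lower endpoints differ by 1, upper endpoints by 3.
    print(hausdorff_interval((0.0, 2.0), (1.0, 5.0)))
```

A distance of this kind lets interval-valued observations be compared and ordered, which is the sort of ingredient an interval quantile would need.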
k . In this procedure, a least-squares loss function in terms of discrepancies between D and M is minimized. The present paper
describes the original hierarchical classes algorithm proposed by De Boeck and Rosenberg (1988), which is based on an alternating
greedy heuristic, and proposes a new algorithm, based on an alternating branch-and-bound procedure. An extensive simulation
study is reported in which both algorithms are evaluated and compared according to goodness-of-fit to the data and goodness-of-recovery
of the underlying true structure. Furthermore, three heuristics for selecting models of different ranks for a given D are
presented and compared. The simulation results show that the new algorithm yields models with slightly higher goodness-of-fit
and goodness-of-recovery values.
Charles Bouveyron 《Journal of Classification》2014,31(1):49-84
In supervised learning, an important issue usually not taken into account by classical methods is that a class represented in the test set may not have been encountered earlier in the learning phase. Classical supervised algorithms will automatically label such observations as belonging to one of the known classes in the training set and will not be able to detect new classes. This work introduces a model-based discriminant analysis method, called adaptive mixture discriminant analysis (AMDA), which can detect several unobserved groups of points and can adapt the learned classifier to the new situation. Two EM-based procedures are proposed for parameter estimation, and model selection criteria are used for selecting the actual number of classes. Experiments on artificial and real data demonstrate the ability of the proposed method to deal with complex and real-world problems. The proposed approach is also applied to the detection of unobserved communities in social network analysis.
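The root reason why Beijing's urban ills are more severe than those of other cities is that the city, in order to sustain its own operation, has in effect joined the race to industrialize, and that this industrialization has in turn accelerated by drawing on Beijing's advantages as a political and cultural center. Building new satellite towns cannot solve this problem. A fundamental way out is to merge Beijing and Tianjin and let Tianjin serve as Beijing's industrial park, so that stripping away the economic function relieves the load on the capital's urban area and ensures that its functions as political and cultural center can be fully exercised.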
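On the problems in information dissemination raised by the modern scientific and technological revolution, and their countermeasures
The modern scientific and technological revolution has had an enormous impact on the ways in which information is disseminated. In the era of multimedia and network communication, we face problems of information security and intellectual property protection, information pollution, information barriers, the allocation of information resources, and information control. Countermeasures need to be studied in order to safeguard the effective dissemination of information and to promote the coordinated development of science and technology, the economy, and society.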
a posteriori blockmodeling for graphs is proposed. The model assumes that the vertices of the graph are partitioned into two unknown blocks
and that the probability of an edge between two vertices depends only on the blocks to which they belong. Statistical procedures
are derived for estimating the probabilities of edges and for predicting the block structure from observations of the edge
pattern only. ML estimators can be computed using the EM algorithm, but this strategy is practical only for small graphs.
A Bayesian estimator, based on Gibbs sampling, is proposed. This estimator is also practical for large graphs. When ML
estimators are used, the block structure can be predicted based on predictive likelihood. When Gibbs sampling is used, the
block structure can be predicted from posterior predictive probabilities. A side result is that when the number of vertices
tends to infinity while the probabilities remain constant, the block structure can be recovered correctly with probability
tending to 1.
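The generative side of the two-block model described above can be sketched in a few lines of Python. This is only an illustration of the model assumption that an edge's probability depends solely on the blocks of its two endpoints; it is not the paper's EM or Gibbs sampling procedure. The names `simulate_two_block_graph` and `edge_frequencies`, the independent assignment of vertices to blocks with probability `theta`, and the use of an undirected graph with a symmetric probability matrix `p` are assumptions of this sketch.

```python
import random
from itertools import combinations


def simulate_two_block_graph(n, theta, p, seed=0):
    """Draw an undirected graph from a two-block model.

    n     -- number of vertices
    theta -- probability that a vertex falls in block 0 (assumed here)
    p     -- symmetric 2x2 matrix; p[r][s] is the edge probability
             between a vertex in block r and a vertex in block s
    """
    rng = random.Random(seed)
    blocks = [0 if rng.random() < theta else 1 for _ in range(n)]
    edges = set()
    for i, j in combinations(range(n), 2):
        if rng.random() < p[blocks[i]][blocks[j]]:
            edges.add((i, j))
    return blocks, edges


def edge_frequencies(blocks, edges):
    """Observed edge frequency for each unordered pair of blocks.

    With the block labels known, these frequencies are the natural
    estimates of the edge probabilities; the hard part addressed in the
    abstract is that the labels are not observed.
    """
    counts = {}  # (r, s) with r <= s -> [number of edges, number of vertex pairs]
    for i, j in combinations(range(len(blocks)), 2):
        key = tuple(sorted((blocks[i], blocks[j])))
        cell = counts.setdefault(key, [0, 0])
        cell[0] += (i, j) in edges
        cell[1] += 1
    return {key: e / total for key, (e, total) in counts.items()}


if __name__ == "__main__":
    p = [[0.6, 0.1], [0.1, 0.5]]
    blocks, edges = simulate_two_block_graph(200, 0.5, p, seed=1)
    print(edge_frequencies(blocks, edges))
```

Because the labels are hidden in practice, the simulated `blocks` would be discarded and only `edges` passed to an estimation procedure; comparing its output against the known labels is one simple way to check recovery on synthetic data.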
Daniël W. van der Palm, L. Andries van der Ark, Jeroen K. Vermunt 《Journal of Classification》2016,33(1):52-72
Traditionally latent class (LC) analysis is used by applied researchers as a tool for identifying substantively meaningful clusters. More recently, LC models have also been used as a density estimation tool for categorical variables. We introduce a divisive LC (DLC) model as a density estimation tool that may offer several advantages in comparison to a standard LC model. When using an LC model for density estimation, a considerable number of increasingly large LC models may have to be estimated before sufficient model-fit is achieved. A DLC model consists of a sequence of small LC models. Therefore, a DLC model can be estimated much faster and can easily utilize multiple processor cores, meaning that this model is more widely applicable and practical. In this study we describe the algorithm of fitting a DLC model, and discuss the various settings that indirectly influence the precision of a DLC model as a density estimation tool. These settings are illustrated using a synthetic data example, and the best performing algorithm is applied to a real-data example. The generated data example showed that, using specific decision rules, a DLC model is able to correctly model complex associations amongst categorical variables.
Latent class (LC) analysis is used by social, behavioral, and medical science researchers among others as a tool for clustering (or unsupervised classification) with categorical response variables, for analyzing the agreement between multiple raters, for evaluating the sensitivity and specificity of diagnostic tests in the absence of a gold standard, and for modeling heterogeneity in developmental trajectories. Despite the increased popularity of LC analysis, little is known about statistical power and required sample size in LC modeling. This paper shows how to perform power and sample size computations in LC models using Wald tests for the parameters describing association between the categorical latent variable and the response variables. Moreover, the design factors affecting the statistical power of these Wald tests are studied. More specifically, we show how design factors which are specific for LC analysis, such as the number of classes, the class proportions, and the number of response variables, affect the information matrix. The proposed power computation approach is illustrated using realistic scenarios for the design factors. A simulation study conducted to assess the performance of the proposed power analysis procedure shows that it performs well in all situations one may encounter in practice.
Probabilistic feature models (PFMs) can be used to explain binary rater judgements about the associations between two types of elements (e.g., objects and attributes) on the basis of binary latent features. In particular, to explain observed object-attribute associations, PFMs assume that respondents classify both objects and attributes with respect to a usually small number of binary latent features, and that the observed object-attribute association is derived as a specific mapping of these classifications. Standard PFMs assume that the object-attribute association probability is the same for all respondents, and that all observations are statistically independent. As both assumptions may be unrealistic, a multilevel latent class extension of PFMs is proposed which allows object and/or attribute parameters to differ across latent rater classes, and which allows dependencies between associations with a common object (attribute) to be modeled by assuming that the link between features and objects (attributes) is fixed across judgements. Formal relationships with existing multilevel latent class models for binary three-way data are described. As an illustration, the models are used to study rater differences in product perception and to investigate individual differences in the situational determinants of anger-related behavior.
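The contributions of the information scientist Eugene Garfield to research on STS issues
This paper examines the contributions of the noted American information scientist Eugene Garfield to the study of science, technology and society (STS). By applying the word-frequency analysis software WordSmithTools to the titles of 1,447 works by Garfield, and drawing on the abstracts and full texts of some of his papers, we gained a deeper understanding of his activities and main achievements in the field of STS research. Garfield's contributions fall into two main kinds. The first is an indirect, methodological contribution: the large citation index databases he created, such as SCI, supply STS research with empirical material in the form of data, and the citation analysis method he helped create supplies STS research with a tool. The second is his direct work on STS questions, for example his use of citation analysis to study Nobel laureates and the views he published on science and technology policy, the ethics of science and technology, and scientific communication.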
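An assessment of the intersubjective nature of Floridi's information ethics
The philosophy of information and the information ethics of Luciano Floridi involve certain specific concepts. Among them, the "extensionalist semantic view" of information, the "object-oriented programming methodology", and "constructionist ethics" bear not only on the author's ontological position in the philosophy of information but also on foundational ethical questions of today's information society. Drawing on the classical transcendental method, we set out the unity of these concepts and their intersubjective nature.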
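Drawing on the organizational experience and funding modes of international big-science research programs, the National Natural Science Foundation of China launched the Natural Science Foundation major research plans. "Optoelectronic Information Functional Materials" is one of the first major research plans to be started and implemented on a trial basis. This paper briefly introduces the scientific goals, research content, and management model of this major research plan.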