Similar literature
 Found 20 similar documents
1.
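Clustering is an important data mining technique whose main purpose is to discover latent knowledge in data. Most clustering algorithms published to date handle only purely numerical or purely categorical data, mainly because similarity between mixed-type data containing several attribute types is hard to measure. This paper proposes a similarity measure for mixed data. For categorical attributes, mutual information is used to build a Bayesian belief network, from which a relationship hierarchy is constructed; distances are then attached to the hierarchy to form a relationship hierarchy distance. For numerical attributes, a standardized Manhattan distance is used. The categorical and numerical measures are finally combined to measure similarity over the whole dataset. On this basis, the mixed-data clustering algorithm CRHD is designed and implemented, and simulation experiments on several UCI datasets, comparing it with existing algorithms, demonstrate its effectiveness.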
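The numerical part of the measure can be illustrated with a minimal sketch. It assumes that "standardized" means dividing each attribute difference by that attribute's range and averaging over attributes; the abstract does not say which standardization CRHD uses, and the function name `normalized_manhattan` is hypothetical. The categorical (relationship hierarchy) part is not reproduced here.

```python
def normalized_manhattan(x, y, ranges):
    """Range-normalized Manhattan distance between two numeric records.

    Assumption: each attribute difference is divided by the attribute's
    range (max - min over the dataset) and the result is averaged, so the
    distance lies in [0, 1]. Attributes with zero range contribute 0.
    """
    total = 0.0
    for a, b, r in zip(x, y, ranges):
        if r > 0:
            total += abs(a - b) / r
    return total / len(ranges)


# Two records with attributes on very different scales.
print(normalized_manhattan([1, 10], [3, 20], ranges=[4, 20]))
```

Normalizing by range keeps an attribute measured in large units from dominating the sum, which matters when the numeric distance is later combined with a categorical one.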

2.
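Traditional similarity measures based on distance or correlation coefficients cannot effectively capture time-delayed expression among genes. To characterize co-regulation among genes more accurately, a similarity measure based on the dynamic time warping (DTW) distance is proposed and combined with an affinity propagation clustering algorithm that allows the number of clusters to be specified. The algorithm is applied to synthetic data and to a real yeast gene dataset; the experimental results show that it yields better clustering results than other classical clustering algorithms.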
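As an illustration of why DTW suits time-delayed profiles, the following is a minimal sketch of the textbook DTW recurrence with an absolute-difference local cost. It is not the paper's implementation; the function name `dtw_distance` and the sample profiles are hypothetical.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# A profile and a delayed copy of it: DTW absorbs the one-step delay.
profile = [0, 1, 2, 1, 0]
delayed = [0, 0, 1, 2, 1, 0]
print(dtw_distance(profile, delayed))
```

A pointwise distance would penalize the delay at every time step, whereas the warping path lets one sample of `profile` match two samples of `delayed`.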

3.
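[Objective] Without prior knowledge, a weighted fuzzy C-means (WFCM) clustering algorithm based on particle swarm optimization (PSO) is used to mine records of suspected medical insurance fraud from more than 300,000 medical insurance records. [Methods] First, an improved Euclidean distance, a similarity function and a cross-entropy function are introduced, and the cross-entropy function is minimized with PSO to analyze the attribute weights. Second, the Calinski-Harabasz (CH) validity index is selected to study clustering validity. Third, based on the results of data preprocessing, the data are fed to the PSO algorithm, which iteratively updates the weight of each attribute, and the CH index is used to dynamically estimate the optimal number of clusters, speeding up FCM clustering. Finally, the attribute weights and the optimal number of clusters are applied to the FCM clustering algorithm, and the suspected fraud results are obtained by clustering according to the membership matrix. [Results] Following the above method, cluster analysis is carried out according to the final membership matrix. [Conclusions] Applying the optimized weights to weighted FCM clustering and to clustering validity evaluation both improves the efficiency of the clustering algorithm and avoids the influence of subjective judgment on the classification.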
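The CH index used to pick the number of clusters can be sketched as follows. This is the standard definition for a hard partition (between-cluster dispersion over within-cluster dispersion, each divided by its degrees of freedom); how the paper adapts it to fuzzy memberships and attribute weights is not stated in the abstract, so the function name `calinski_harabasz` and the toy data are assumptions.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index for a hard partition (needs 2 <= k < n)."""
    n = len(points)
    clusters = sorted(set(labels))
    k = len(clusters)
    dim = len(points[0])

    def mean(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(dim)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0  # dispersion of cluster centroids around the overall mean
    within = 0.0   # dispersion of points around their own centroid
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = mean(members)
        between += len(members) * sq_dist(centroid, overall)
        within += sum(sq_dist(p, centroid) for p in members)
    return (between / (k - 1)) / (within / (n - k))


points = [[0.0], [2.0], [10.0], [12.0]]
good = calinski_harabasz(points, [0, 0, 1, 1])  # compact, well separated
bad = calinski_harabasz(points, [0, 1, 0, 1])   # clusters interleaved
print(good > bad)
```

In a model-selection loop one would compute this score for each candidate number of clusters and keep the partition with the highest value.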

4.
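Exploiting the symmetry of the data objects to be clustered, an ant clustering algorithm based on a point symmetry distance is proposed. Instead of using the Euclidean distance to compute the similarity of objects within a class, the algorithm uses a new point symmetry distance. Experimental results show that, compared with the standard ant clustering algorithm, the proposed algorithm better identifies the number of clusters and the partition when processing datasets with symmetric structure.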

5.
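To improve the efficiency of train wheel tread inspection, a dynamic wheel tread inspection system based on machine vision is designed. The k-means clustering algorithm is analyzed and improved with a weighted Euclidean distance; because clustering preserves maximum similarity within groups, the k-means algorithm based on the weighted Euclidean distance is applied to image processing in machine vision. The original image is first preprocessed with image enhancement and grayscale conversion, and then threshold segmentation is performed following the idea of feature clustering, so that the features of each part of the image stand out more clearly. The image processing results show that the visual inspection system for wheel tread damage based on k-means clustering with a weighted Euclidean distance can effectively detect tread damage.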
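A minimal sketch of k-means with a weighted Euclidean distance is given below. The abstract does not say how the weights are chosen or which pixel features are clustered, so the weights, the initial centers and the function names `weighted_euclidean` and `weighted_kmeans` are illustrative assumptions.

```python
import math


def weighted_euclidean(x, y, w):
    """sqrt(sum_i w_i * (x_i - y_i)^2); w holds non-negative feature weights."""
    return math.sqrt(sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w)))


def weighted_kmeans(points, init_centers, weights, iters=10):
    """Plain Lloyd iterations, with the weighted distance in the assignment step."""
    centers = [list(c) for c in init_centers]
    labels = []
    for _ in range(iters):
        # Assignment step: nearest center under the weighted distance.
        labels = [
            min(range(len(centers)),
                key=lambda k: weighted_euclidean(p, centers[k], weights))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels, centers = weighted_kmeans(points, [(0, 0), (10, 0)], weights=[1.0, 1.0])
print(labels, centers)
```

Raising the weight of one feature makes differences in that feature count more in the assignment step, which is how a weighted distance can emphasize the features that separate damaged from undamaged regions.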

6.
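Many clustering algorithms have two drawbacks: 1) they use some distance as the similarity measure, so the acceptance region of each class is spherical and cannot match complex pattern distributions; 2) they offer no help in determining a reasonable number of classes. For a clustering algorithm based on the maximum likelihood criterion, the class acceptance region is spherical or ellipsoidal and can match the pattern distribution better. Using prior probabilities when computing likelihood values provides a basis for determining a reasonable number of classes. The contribution of this paper is to incorporate a genetic algorithm into a neural network clustering algorithm based on the maximum likelihood criterion, which solves the problem of choosing initial cluster centers and obtains the optimal clustering.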

7.
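Cluster analysis is a data reduction technique that groups data into classes according to the similarity of their features; it is a very effective data mining tool and has attracted wide attention. Starting from the problem of similarity measurement in clustering algorithms, this work replaces the traditional similarity measure based on Euclidean distance with one based on manifold distance, uses two-stage clustering to address the increased computational cost introduced by the manifold distance, and applies the resulting clustering algorithm to cluster analysis.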

8.
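The K-Modes algorithm is a classical clustering algorithm for categorical (character-type) data. When computing the distance between attribute values of objects, it uses simple 0-1 matching, which cannot reflect latent similarity between attribute values. This work measures the similarity between categorical attribute values using the connectivity degree from graph clustering theory and thereby improves the traditional K-Modes algorithm. Experimental results show that the method offers some improvement over the traditional K-Modes algorithm.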
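The limitation of 0-1 matching can be seen in a minimal sketch of the simple matching dissimilarity used by standard K-Modes. The records and attribute values are made up for illustration, and the connectivity-based replacement proposed in the paper is not reproduced here.

```python
def simple_matching(x, y):
    """K-Modes dissimilarity: the number of attributes whose values differ."""
    return sum(1 for a, b in zip(x, y) if a != b)


base = ("red", "small", "round")
near = ("orange", "small", "round")  # intuitively close to red
far = ("blue", "small", "round")     # intuitively far from red

# Both comparisons give 1: 0-1 matching cannot tell 'orange' from 'blue'.
print(simple_matching(base, near), simple_matching(base, far))
```

Any measure that assigns graded distances between attribute values, such as the one the paper derives from connectivity, would separate these two cases.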

9.
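To address problems such as the sensitivity to initial values of the K-means algorithm commonly used in spectral clustering to cluster the eigenvector space, a new spectral clustering algorithm based on affinity propagation (AP) is proposed. First, the dynamic time warping (DTW) distance is used to measure the structural similarity between ship Automatic Identification System (AIS) trajectories, yielding a distance matrix. Second, a fast AP clustering algorithm is used to improve the traditional spectral clustering algorithm, and the method is validated, for a specified number of classes, on AIS trajectory data of ships in an inland-river bridge-area waterway. Simulation results show that, without increasing time complexity, the proposed algorithm is more robust than traditional spectral clustering and improves experimental accuracy by 5.24%.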

10.
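Software cost data are often high-dimensional with mixed attributes, so traditional similarity measures no longer apply. This paper measures data similarity by building a high-dimensional fuzzy C-means (FCM) clustering algorithm for software cost data. First, an initial mapping from ordinal attributes to numerical attributes is defined. Then, an improved iterative high-dimensional FCM clustering algorithm is built to revise the ordinal-to-numerical mapping and optimize the clustering result. Finally, the resulting fuzzy partition matrix is used to measure the similarity of software cost data. Experimental results show that, by optimizing the clustering result, the similarity measure defined in this paper can improve the accuracy of software cost estimation.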

11.
Markedness is a common phenomenon in languages and is reflected in hearing, vision and sense, i.e. in variation in three aspects: phonology, morphology and semantics. This paper focuses on the interpretation of markedness in language use from three perspectives (pragmatic, psychological and cognitive), with the aim of defining the function of markedness.

12.
The discovery of the prolific Ordovician Red River reservoirs in 1995 in southeastern Saskatchewan was the catalyst for extensive exploration activity, which resulted in the discovery of more than 15 new Red River pools. The best yields of Red River production to date have been from dolomite reservoirs. Understanding the processes of dolomitization is, therefore, crucial for predicting the connectivity, spatial distribution and heterogeneity of dolomite reservoirs. The Red River reservoirs in the Midale area consist of 3~4 thin dolomitized zones, with a total thickness of about 20 m, which occur at the top of the Yeoman Formation. Two types of replacement dolomite were recognized in the Red River reservoir: dolomitized burrow infills and dolomitized host matrix. The spatial distribution of dolomite suggests that burrowing organisms played an important role in facilitating fluid flow in the backfilled sediments. This resulted in penecontemporaneous dolomitization of burrow infills by normal seawater. The dolomite in the host matrix is interpreted as having formed at shallow burial from evaporitic seawater during precipitation of the Lake Almar anhydrite that immediately overlies the Yeoman Formation. However, the low δ18O values of dolomitized burrow infills (-5.9‰~ -7.8‰, PDB) and matrix dolomites (-6.6‰~ -8.1‰, avg. -7.4‰ PDB), compared to the estimated values for late Ordovician marine dolomite, could be attributed to modification and alteration of the dolomite at higher temperatures during deeper burial, which could also be responsible for its 87Sr/86Sr ratios (0.7084~0.7088), which are higher than those suggested for late Ordovician seawater (0.7078~0.7080). The trace amounts of saddle dolomite cement in the Red River carbonates are probably related to "cannibalization" of earlier replacement dolomite during chemical compaction.

13.
A computer generator for randomly layered structures. YU Jia-shun 1,2, HE Zhen-hua 2 (1. The Institute of Geological and Nuclear Sciences, New Zealand; 2. State Key Laboratory of Oil and Gas Reservoir Geology and Exploitation, Chengdu University of Technology, China). Abstract: An algorithm is introd…

14.
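By combining theoretical derivation with laboratory experiments, a method is established for determining the threshold (start-up) pressure gradient of low-permeability heterogeneous sandstone reservoirs. First, using the analogy between the reservoir flow field and an electric field, a formula for the threshold pressure gradient of heterogeneous sandstone reservoirs is derived. Second, a method for testing the threshold pressure gradient of heterogeneous sandstone reservoirs is established based on the steady-state flow experimental method. The results show that determining the threshold pressure gradient of such reservoirs follows two equivalence principles. For an areally heterogeneous reservoir, the threshold pressure gradient equals the length-weighted average of the threshold pressure gradients of the individual permeability segments. For a vertically heterogeneous reservoir, it equals the average of the threshold pressure gradients of the individual permeability layers, weighted by the product of permeability and flow cross-sectional area. The results can guide the determination of reasonable well spacing in low-permeability heterogeneous sandstone reservoirs and promote the efficient development of this type of reservoir.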
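The two equivalence principles amount to two weighted averages, sketched below. The function names, units and sample values are hypothetical and chosen only to make the arithmetic easy to check; the paper's full derivation is not reproduced.

```python
def areal_threshold_gradient(gradients, lengths):
    """Segments in series: length-weighted average of the segment gradients."""
    return sum(g * l for g, l in zip(gradients, lengths)) / sum(lengths)


def vertical_threshold_gradient(gradients, permeabilities, areas):
    """Layers in parallel: average weighted by permeability times flow area."""
    weights = [k * a for k, a in zip(permeabilities, areas)]
    return sum(g * w for g, w in zip(gradients, weights)) / sum(weights)


# Hypothetical values: gradients in MPa/m, lengths in m,
# permeabilities in mD, flow areas in m^2.
print(areal_threshold_gradient([0.02, 0.04], [100, 300]))
print(vertical_threshold_gradient([0.01, 0.03], [10, 30], [1.0, 1.0]))
```

The series and parallel forms mirror the flow-field and electric-field analogy mentioned in the abstract: segments along the flow path combine by length, while layers side by side combine by their conductance-like weight.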

15.
As a modern American novelist famous in the literary world, Hemingway was not someone who followed trends but a sharp observer. He was also a master of tragedy who paid great attention to existence, fate and final outcomes. The tragedy of the characters in his works is an ultimate tragedy in the sense of a fearless challenge that fails. The beauty of this tragedy does not arise from the destruction of life; rather, its value lies in the act of striving. The characters enact for the reader the tragedy of challenging limits and death.

16.
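This paper describes the complete course of a paleomagnetic study of Cretaceous to Quaternary strata on Hainan Island and the adjacent continental margin. By analyzing variations in the declination and inclination of the remanent magnetization vectors in the rocks, the following tectonic evolution model for Hainan Island since the Cretaceous is proposed: an early stage of southward migration accompanied by clockwise rotation, and a later stage of northward movement accompanied by counterclockwise rotation. In light of geological and geophysical data from this and neighboring areas, the following interpretation of this terrane motion is offered: an early extensional regime in the Beibu Gulf markedly stretched and thinned the crust there, forming the Beibu Gulf Basin and thereby causing the early tectonic movement of Hainan Island, whereas the later tectonic movement was mainly influenced by seafloor spreading in the South China Sea. Clarifying the pattern of motion of the Hainan terrane is of theoretical and practical significance for understanding the formation and evolution of the Beibu Gulf petroleum basin.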

17.
Spatial databases store numerous geometric objects. An important function of a spatial database is to let users browse the geometric objects efficiently as a map, so the database should quickly display the objects that users are interested in on the display window. This process includes two operations: retrieving data from the database and then drawing them on the screen. Accordingly, to improve efficiency, we should try to reduce the time of both retrieving and displaying objects. The former can be achieved with the aid of a spatial index such as the R-tree; the latter requires simplifying the objects. Simplification means that objects are shown with sufficient but not unnecessary detail, which depends on the browsing scale. The major problem is therefore how to retrieve data at different levels of detail efficiently. This paper introduces the implementation of a multi-scale index, generalized from the R-tree, in the spatial database SISP (Spatial Information Shared Platform). The generalization differs from the R-tree in two respects. First, every node and geometric object in the generalization is assigned an importance value, and every vertex in the objects is assigned an importance value too. The importance value can be used to decide which data should be retrieved from disk in a query. Second, geometric objects in the generalization are divided into one or more sub-blocks, and vertices are totally ordered by their importance value. With the help of the generalized R-tree, one can easily retrieve data at different levels of detail. Experiments are performed on real-life data to evaluate the performance of solutions that use a normal spatial index and a multi-scale spatial index, respectively. The results show that the solution using the multi-scale index in SISP is satisfactory.
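The idea of ordering vertices by importance so that a coarse view reads only a prefix can be sketched as follows. This is a simplified illustration, not the SISP implementation: the in-memory layout, the importance values and the names `build_ordered` and `retrieve` are assumptions, and the R-tree node structure and sub-blocks are omitted.

```python
from itertools import takewhile


def build_ordered(vertices):
    """Store vertices sorted by descending importance, keeping drawing order.

    vertices: list of (x, y, importance) in the order they are drawn.
    Returns a list of (importance, original_index, (x, y)).
    """
    indexed = [(imp, i, (x, y)) for i, (x, y, imp) in enumerate(vertices)]
    return sorted(indexed, key=lambda t: -t[0])


def retrieve(ordered, min_importance):
    """Read only the prefix that is important enough, then restore drawing order."""
    prefix = takewhile(lambda t: t[0] >= min_importance, ordered)
    return [pt for _, _, pt in sorted(prefix, key=lambda t: t[1])]


line = [(0, 0, 1.0), (1, 0.1, 0.2), (2, 0, 0.8), (3, 0.05, 0.1), (4, 0, 1.0)]
ordered = build_ordered(line)
print(retrieve(ordered, 0.5))  # coarse view: only the important vertices
print(retrieve(ordered, 0.0))  # full detail
```

Because the stored order is by importance, a small-scale query can stop reading as soon as it reaches the first vertex below its threshold, which is what saves retrieval time compared with loading every vertex and simplifying afterwards.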

18.
19.
The elongation method, originally proposed by Imamura, has been further developed over many years in our group as a method towards O(N) with high efficiency and high accuracy for systems of any dimension. This treatment, designed for one-dimensional (1D) polymers, is now available for three-dimensional (3D) systems, but geometry optimization is currently possible only for 1D systems. As an approach toward post-Hartree-Fock, it was also extended to

20.
Various applications relevant to exciton dynamics, such as organic solar cells, large-area organic light-emitting diodes and thermoelectric devices, operate under a temperature gradient. The potentially abnormal behavior of exciton dynamics driven by the temperature difference may affect the efficiency and performance of the corresponding devices. In the above situations, the exciton dynamics under a temperature difference is mixed with
