首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 62 毫秒
1.
基于特征相关的改进加权朴素贝叶斯分类算法   总被引:1,自引:0,他引:1  
朴素贝叶斯分类算法的特征项间强独立性的假设在现实中是很难满足的.为了在一定程度上放松这一假设,提出了基于特征相关的改进加权朴素贝叶斯分类算法,该算法采用一种新的权重计算方法,这种权重计算方法是在传统词频反文档频率(TF-IDF)权重计算基础上,考虑到特征项在类内和类间的分布情况,另外还结合特征项间的相关度,调整权重计算值,加大最能代表所属类的特征项的权重,将它称之为TF-IDF-FC权重计算.与基于传统TF-IDF权重的加权朴素贝叶斯分类算法和其他常用加权朴素贝叶斯分类算法比较,如基于属性加权的朴素贝叶斯分类算法,这种算法的分类效果均有一定的提高.  相似文献   

2.
为了提高高速公路交通事件检测的效果,首先从交通流基本参数、交通流组合参数、不同区间交通流参数对交通事件参数的变化进行全面的分析,构建交通事件初始特征变量集,并利用AdaBoost算法、梯度提升树(GBDT)算法、随机森林(RF)算法对初始特征变量进行筛选,通过三种方法综合比较分析得出最终的重要变量.对随机森林中的决策树进行加权计算,构建加权随机森林,并利用粒子群(PSO)算法优化加权随机森林模型.通过采集的高速公路交通事件数据进行对比分析,实验结果表明,在交通事件初始特征变量中筛选出重要变量,对检测的精度有所提高,加权随机森林的检测性能也要优于传统的支持向量机(SVM)和随机森林.  相似文献   

3.
为了剔除交通数据样本有限、事件特征变量构建相对主观且包含的信息冗余等因素对交通事件检测效果的影响,设计了一种高速公路交通事件检测方法.利用因子分析技术将交通流数据初始特征变量"降维"并提取包含全部原始数据信息的特征变量主因子;利用支持向量机完成事件检测,结合libsvm软件中的grid. py模块实现支持向量机子模型相关参数的设定;选用Fresim软件的模拟数据进行对比分析.检测结果表明,所提出的算法检测效果优势明显.  相似文献   

4.
朴素贝叶斯在处理分类问题上简单高效,通常它假设属性间是条件独立的,且各属性变量对类变量的影响程度是相同的,但在实际应用中这些都难以被满足,从而使得其分类性能降低.因此,提出基于属性约简的加权朴素贝叶斯分类算法,该算法首先根据各属性不同取值的分类能力及属性间的对称不确定性大小,去除了无关属性和冗余属性,使得筛选后的属性之间具有较低的关联度和较强的分类能力;然后再结合属性与类变量及属性间的相关性对各属性进行加权;最后对待判样本进行分类.经实验结果表明,该算法有效地提升了朴素贝叶斯的分类性能.  相似文献   

5.
一种改进的朴素贝叶斯分类器在文本分类中的应用研究   总被引:1,自引:0,他引:1  
文本分类是数据挖掘领域中重要的研究分支.通过对自适应遗传算法和朴素贝叶斯分类器的研究,提出一种基于自适应遗传算法的朴素贝叶斯分类算法.将该算法应用于中文文本分类中,可以生成最优贝叶斯分类器及最优属性集合,提高分类精度.  相似文献   

6.
针对文本分类问题,将朴素贝叶斯分类与自组织特征映射网络分类相结合,提出了基于相对特征的文本分类算法.该算法具有很快的速度和较高的准确率,从而为构建高效的搜索引擎提供支撑.  相似文献   

7.
提出了基于离散时间信号相关性的自动交通事件检测算法.将交通信息数据转化为离散时间信号并进行相关性计算,有效定位通过上、下游截面的同一组交通流.解释了互相关系数的特征,并采用仿真数据进行性能验证.结果表明:基于离散时间信号相关性的自动交通事件检测算法具有可视性且易于理解,在低饱和交通环境下表现依然稳健,具有很好的适应性.  相似文献   

8.
在手机短信的使用中,垃圾短信的数量、特征及内容均在不断地变化.传统的基于固定模式的检测方法,比如:黑白名单和基于内容检测的方法都会出现因信息更新不及时而导致的性能降低的情况.因此提出一种基于改进的朴素贝叶斯的方法以提高垃圾短信分类的性能.首先利用频繁出现的单词创建数据特征,然后找出垃圾短信和非垃圾短信的差异特征词来构建分类关键词,最后应用改进的朴素贝叶斯算法进行分类.实验结果表明,新算法可以有效地提高分类精度.  相似文献   

9.
朴素贝叶斯算法因其分类精度高、模型简单等优点而被得到普遍应用,但因为它需要具备很强的属性之间的条件独立性假设,使得其在实际分类学习中很难实现.针对这个缺点,提出了一种基于遗传算法的加权朴素贝叶斯分类算法(G_WNB).该算法将遗传算法(GA)与加权朴素贝叶斯分类算法(WNB)相结合,首先使用基于Rough Set的加权朴素贝叶斯分类算法,综合信息论与代数论给出的属性权值求解方法,计算出每个属性的权值,以初始权值作为初始种群,加权朴素贝叶斯的分类正确率为适应度函数,采用遗传算法优选,以使适应度函数最高的权值为数据集的最终权值,最后使用G_WNB进行分类.实验表明,该算法提高了分类准确率,同时提高了朴素贝叶斯分类器的性能.  相似文献   

10.
为提高电子文本分类效果,解决独立同分布模型在标记数据不足时存在的参数估计问题,提出了一种基于Nesterov平滑的高阶路径朴素贝叶斯文本分类算法.首先,利用传统意义的朴素贝叶斯事件模型构建高阶路径形式的文本分类模型,利用高阶路径中的隐式链接信息来提高文本分类模型的性能;其次,针对朴素贝叶斯事件模型中采用拉普拉斯平滑的二阶差分过程容易产生信息丢失、噪声增强的问题,提出基于Nesterov平滑的高阶路径朴素贝叶斯文本分类改进算法;最后,通过基准数据集和图书馆电子文本分类实验,验证了所提算法的有效性.  相似文献   

11.
Language markedness is a common phenomenon in languages, and is reflected from hearing, vision and sense, i.e. the variation in the three aspects such as phonology, morphology and semantics. This paper focuses on the interpretation of markedness in language use following the three perspectives, i.e. pragmatic interpretation, psychological interpretation and cognitive interpretation, with an aim to define the function of markedness.  相似文献   

12.
The discovery of the prolific Ordovician Red River reservoirs in 1995 in southeastern Saskatchewan was the catalyst for extensive exploration activity which resulted in the discovery of more than 15 new Red River pools. The best yields of Red River production to date have been from dolomite reservoirs. Understanding the processes of dolomitization is, therefore, crucial for the prediction of the connectivity, spatial distribution and heterogeneity of dolomite reservoirs.The Red River reservoirs in the Midale area consist of 3~4 thin dolomitized zones, with a total thickness of about 20 m, which occur at the top of the Yeoman Formation. Two types of replacement dolomite were recognized in the Red River reservoir: dolomitized burrow infills and dolomitized host matrix. The spatial distribution of dolomite suggests that burrowing organisms played an important role in facilitating the fluid flow in the backfilled sediments. This resulted in penecontemporaneous dolomitization of burrow infills by normal seawater. The dolomite in the host matrix is interpreted as having occurred at shallow burial by evaporitic seawater during precipitation of Lake Almar anhydrite that immediately overlies the Yeoman Formation. However, the low δ18O values of dolomited burrow infills (-5.9‰~ -7.8‰, PDB) and matrix dolomites (-6.6‰~ -8.1‰, avg. -7.4‰ PDB) compared to the estimated values for the late Ordovician marine dolomite could be attributed to modification and alteration of dolomite at higher temperatures during deeper burial, which could also be responsible for its 87Sr/86Sr ratios (0.7084~0.7088) that are higher than suggested for the late Ordovician seawaters (0.7078~0.7080). The trace amounts of saddle dolomite cement in the Red River carbonates are probably related to "cannibalization" of earlier replacement dolomite during the chemical compaction.  相似文献   

13.
何延凌 《科技信息》2008,(4):258-258
Language is a means of verbal communication. People use language to communicate with each other. In the society, no two speakers are exactly alike in the way of speaking. Some differences are due to age, gender, statue and personality. Above all, gender is one of the obvious reasons. The writer of this paper tries to describe the features of women's language from these perspectives: pronunciation, intonation, diction, subjects, grammar and discourse. From the discussion of the features of women's language, more attention should be paid to language use in social context. What's more, the linguistic phenomena in a speaking community can be understood more thoroughly.  相似文献   

14.
AcomputergeneratorforrandomlylayeredstructuresYUJia shun1,2,HEZhen hua2(1.TheInstituteofGeologicalandNuclearSciences,NewZealand;2.StateKeyLaboratoryofOilandGasReservoirGeologyandExploitation,ChengduUniversityofTechnology,China)Abstract:Analgorithmisintrod…  相似文献   

15.
理论推导与室内实验相结合,建立了低渗透非均质砂岩油藏启动压力梯度确定方法。首先借助油藏流场与电场相似的原理,推导了非均质砂岩油藏启动压力梯度计算公式。其次基于稳定流实验方法,建立了非均质砂岩油藏启动压力梯度测试方法。结果表明:低渗透非均质砂岩油藏的启动压力梯度确定遵循两个等效原则。平面非均质油藏的启动压力梯度等于各级渗透率段的启动压力梯度关于长度的加权平均;纵向非均质油藏的启动压力梯度等于各渗透率层的启动压力梯度关于渗透率与渗流面积乘积的加权平均。研究成果可用于有效指导低渗透非均质砂岩油藏的合理井距确定,促进该类油藏的高效开发。  相似文献   

16.
As an American modern novelist who were famous in the literary world, Hemingway was not a person who always followed the trend but a sharp observer. At the same time, he was a tragedy maestro, he paid great attention on existence, fate and end-result. The dramatis personae's tragedy of his works was an extreme limit by all means tragedy on the meaning of fearless challenge that failed. The beauty of tragedy was not produced on the destruction of life, but now this kind of value was in the impact activity. They performed for the reader about the tragedy on challenging for the limit and the death.  相似文献   

17.
There are numerous geometric objects stored in the spatial databases. An importance function in a spatial database is that users can browse the geometric objects as a map efficiently. Thus the spatial database should display the geometric objects users concern about swiftly onto the display window. This process includes two operations:retrieve data from database and then draw them onto screen. Accordingly, to improve the efficiency, we should try to reduce time of both retrieving object and displaying them. The former can be achieved with the aid of spatial index such as R-tree, the latter require to simplify the objects. Simplification means that objects are shown with sufficient but not with unnecessary detail which depend on the scale of browse. So the major problem is how to retrieve data at different detail level efficiently. This paper introduces the implementation of a multi-scale index in the spatial database SISP (Spatial Information Shared Platform) which is generalized from R-tree. The difference between the generalization and the R-tree lies on two facets: One is that every node and geometric object in the generalization is assigned with a importance value which denote the importance of them, and every vertex in the objects are assigned with a importance value,too. The importance value can be use to decide which data should be retrieve from disk in a query. The other difference is that geometric objects in the generalization are divided into one or more sub-blocks, and vertexes are total ordered by their importance value. With the help of the generalized R-tree, one can easily retrieve data at different detail levels.Some experiments are performed on real-life data to evaluate the performance of solutions that separately use normal spatial index and multi-scale spatial index. The results show that the solution using multi-scale index in SISP is satisfying.  相似文献   

18.
本文叙述了对海南岛及其毗邻大陆边缘白垩纪到第四纪地层岩石进行古地磁研究的全部工作过程。通过分析岩石中剩余磁矢量的磁偏角及磁倾角的变化,提出海南岛白垩纪以来经历的构造演化模式如下:早期伴随顺时针旋转而向南迁移,后期伴随逆时针转动并向北运移。联系该地区及邻区的地质、地球物理资料,对海南岛上述的构造地体运动提出以下认识:北部湾内早期有一拉张作用,主要是该作用使湾内地壳显著伸长减薄,形成北部湾盆地。从而导致了海南岛的早期构造运动,而海南岛后期的构造运动则主要是受南海海底扩张的影响。海南地体运动规律的阐明对于了解北部湾油气盆地的形成演化有重要的理论和实际意义。  相似文献   

19.
In the 19th century the society was controlled by men, and women were just appendants of them, they had not any rights and freedom. But Jane was an exception, she showed some characteristics of early feminist. Jane showed her characteristics of feminism in three aspects: rebellion, equality, and independence. These characteristics were helpful to her success, and feminism is the only way out for women of that time.  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号