Similar literature
 Found 20 similar documents (search time: 46 ms)
1.
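By analyzing the characteristics of Chinese short texts, this paper proposes a short-text similarity algorithm based on syntax and semantics. The algorithm combines the semantic similarity and the syntactic similarity of Chinese sentences: it computes the similarity of short texts that share the same syntactic structure and takes into account the contribution of phrase order to similarity. Experiments show that the results of the proposed algorithm are closer to human subjective judgment and that it achieves relatively good precision and recall.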

2.
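Spam e-mail contains a large amount of interfering information, which degrades document similarity measurement. To address this problem, this paper introduces the Needleman-Wunsch algorithm into text similarity computation and proposes an efficient clustering algorithm tailored to it, providing an effective spam identification technique for anti-spam systems. Compared with traditional clustering algorithms based solely on HowNet, on semantics, and the like, the method substantially improves both algorithmic efficiency and clustering quality.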
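The abstract does not say how Needleman-Wunsch scores are turned into a text similarity, so the following is only a minimal sketch of the standard global-alignment recurrence applied to token sequences. The scoring values (`match=1`, `mismatch=-1`, `gap=-1`) and the length normalization in `alignment_similarity` are assumptions chosen for illustration, not the paper's settings.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap inserted in b
            left = score[i][j - 1] + gap    # gap inserted in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]


def alignment_similarity(a, b):
    """Hypothetical normalization: alignment score divided by the longer length."""
    if not a and not b:
        return 1.0
    return needleman_wunsch_score(a, b) / max(len(a), len(b))


# An inserted noise token costs a single gap instead of breaking the match.
clean = ["buy", "cheap", "pills"]
noisy = ["buy", "xx", "cheap", "pills"]
print(alignment_similarity(clean, noisy))
```

Because the alignment can skip an inserted token at the cost of one gap, interfering tokens lower the score gradually rather than destroying it, which is plausibly why a sequence-alignment measure suits noisy spam text.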

3.
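Event-based text similarity computation   Total citations: 2, self-citations: 0, citations by others: 2
A large body of research has shown that events objectively exist in many texts. From a semantic perspective, many texts are composed of events, and the event is the smallest semantic unit of text representation. This paper presents an event-based text representation model and, on the basis of this model, discusses methods for computing text similarity at two levels: text-type similarity and text-content similarity.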

4.
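A method for improving the quality of text clustering algorithms   Total citations: 1, self-citations: 0, citations by others: 1
Text clustering algorithms based on the VSM (vector space model) ignore the semantic information between words and the relationships among dimensions, which makes text similarity computation insufficiently accurate. To address this, the paper proposes computing inter-document similarity from semantic distance, together with a two-stage clustering scheme, to improve clustering quality. First, documents are analyzed semantically and a first clustering pass is performed with the nearest-neighbor algorithm; second, class feature words are kept or discarded according to their similarity weights; then classes are merged; finally, a second clustering pass is performed, which resolves the nearest-neighbor algorithm's sensitivity to input order. Experimental results show that the proposed method significantly improves clustering precision and recall and largely resolves the problems of VSM-based text clustering algorithms.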
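The input-order sensitivity mentioned above can be seen in a small sketch of textbook nearest-neighbor clustering. This is an illustration under assumptions: plain cosine similarity stands in for the paper's semantic-distance measure, and the `threshold` value is arbitrary.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def nearest_neighbor_cluster(vectors, threshold=0.8):
    """Assign each vector to the cluster of its most similar earlier vector,
    or open a new cluster when no earlier vector reaches the threshold."""
    labels = []
    for vec in vectors:
        best_label, best_sim = None, threshold
        # zip stops at len(labels), so only previously seen vectors are compared.
        for seen, label in zip(vectors, labels):
            sim = cosine(vec, seen)
            if sim >= best_sim:
                best_label, best_sim = label, sim
        if best_label is None:
            best_label = max(labels, default=-1) + 1
        labels.append(best_label)
    return labels


a, b, c = [1, 0], [1, 1], [0, 1]
print(nearest_neighbor_cluster([a, b, c], threshold=0.7))  # b bridges a and c
print(nearest_neighbor_cluster([a, c, b], threshold=0.7))  # a and c are separated first
```

With the bridging vector `b` in the middle, all three vectors chain into one cluster; when `b` arrives last, `a` and `c` have already been split into two clusters. The second clustering pass described in the abstract is meant to remove this kind of dependence on arrival order.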

5.
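To address the problem that existing sentence similarity algorithms do not consider sentence semantics, a sentence similarity computation method based on lexical, syntactic, and semantic information is proposed. Sentence similarity is divided into three levels: lexical, syntactic, and semantic. At the lexical level, lexical similarity is computed by constructing a word similarity matrix and a number-sequence similarity matrix for the sentences; at the syntactic level, syntactic similarity is computed from the similarity of the RDF triples into which concept words are converted; at the semantic level, semantic similarity is computed from the semantic distance represented by the shortest path in the ontology tree. A sentence semantic similarity model is then proposed; sentence pairs from the book domain are collected as the test set, and a book-domain ontology is built as the knowledge source. Experimental results show that the proposed method achieves higher precision and recall, with an F-measure of 0.6499, which is about 12%, 17%, and 16% higher than the cosine similarity algorithm, the edit-distance-based algorithm, and the TF-IDF-based algorithm, respectively.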
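For the semantic level, a shortest-path distance in an ontology tree can be sketched as follows. The toy book-domain ontology, the `parent` mapping, and the conversion `1 / (1 + distance)` are all assumptions made for illustration; the abstract does not give the paper's actual ontology or formula.

```python
def ancestors(node, parent):
    """Return the chain [node, parent(node), ..., root]."""
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def path_length(a, b, parent):
    """Number of edges on the shortest path between a and b in the tree."""
    depth_from_a = {n: d for d, n in enumerate(ancestors(a, parent))}
    for d_b, n in enumerate(ancestors(b, parent)):
        if n in depth_from_a:  # first shared node is the lowest common ancestor
            return depth_from_a[n] + d_b
    raise ValueError("concepts do not share a root")


def path_similarity(a, b, parent):
    """Hypothetical mapping from distance to a similarity in (0, 1]."""
    return 1.0 / (1.0 + path_length(a, b, parent))


# Toy ontology (child -> parent); the concept names are illustrative only.
parent = {
    "novel": "fiction",
    "poetry": "fiction",
    "fiction": "book",
    "textbook": "book",
}
print(path_length("novel", "poetry", parent))
print(path_similarity("novel", "textbook", parent))
```

Sibling concepts under the same parent end up closer than concepts that meet only at the root, which is the intuition behind using path length as a semantic distance.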

6.
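Collaborative filtering recommendation algorithms do not consider the semantic relationships between recommended items. To address this, an improved collaborative filtering algorithm that fuses the semantic similarity of recommended items is proposed. First, a knowledge graph representation learning algorithm embeds the semantic information of the items into a low-dimensional semantic space; then the semantic similarity between items is computed and fused into the similarity computation of collaborative filtering, compensating for the failure of collaborative filtering to consider the items' own semantic knowledge. Experimental results show that, compared with traditional collaborative filtering, the improved algorithm achieves higher precision, recall, and coverage.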
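One simple way to fuse the two similarities is a weighted sum, sketched below. The linear form and the weight `alpha` are assumptions; the abstract does not state the paper's fusion rule. The embeddings are taken as given (for example, from a knowledge graph embedding model trained elsewhere).

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def fused_similarity(ratings_i, ratings_j, emb_i, emb_j, alpha=0.5):
    """Blend rating-based similarity with embedding-based semantic similarity.

    alpha is a hypothetical mixing weight: 1.0 gives plain item-based
    collaborative filtering, 0.0 gives purely semantic similarity.
    """
    cf_sim = cosine(ratings_i, ratings_j)
    semantic_sim = cosine(emb_i, emb_j)
    return alpha * cf_sim + (1.0 - alpha) * semantic_sim


# Two items rated identically by users but semantically unrelated.
print(fused_similarity([5, 0, 3], [5, 0, 3], [1.0, 0.0], [0.0, 1.0], alpha=0.5))
```

The semantic term also gives a usable similarity for items with few or no shared ratings, where the rating-based term alone would be close to zero.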

7.
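Semantic similarity models in the geographic domain consider too few factors and are highly subjective. In view of these problems and of the structural characteristics of ontology models, a new method for computing node density is proposed, and the computation of ontology concept similarity is analyzed in terms of the relation types between concepts, node density, and node depth, which are grouped together as distance factors. Semantic information content is computed from the hierarchical network structure of the ontology; this computation does not rely on expert experience and is therefore objective. Combining semantic distance, information content, attributes, and other factors that affect similarity, a comprehensive algorithm for computing the semantic similarity between concepts is proposed. The algorithm recognizes that different factors differ in importance in semantic similarity computation and therefore assigns different weights to geographic ontology relations. A case study on the semantic similarity of entities in land-use classification shows that the proposed algorithm effectively improves the accuracy and validity of semantic similarity computation and yields information retrieval results that better match human cognition.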

8.
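To address information redundancy and computational complexity in text classification and information retrieval, clustering algorithms for antonyms, synonyms, and near-synonyms are proposed on the basis of the Hierarchical Network of Concepts (HNC). The basic idea is to map word semantics onto the HNC concept symbol system, turning every word into a series of symbol strings, and then, on the basis of computed semantic similarity and semantic distance, to cluster synonyms, near-synonyms, and antonyms over an HNC symbol corpus of words.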

10.
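Traditional text feature selection algorithms consider neither the semantics of features nor the relationship between features and classes. To address this, a feature selection algorithm combining semantics and classification contribution is proposed. An LDA topic model is used to obtain representations of texts and words, and the importance of a word to a text is obtained by computing the semantic similarity between the word and the text. A Word2vec word-vector model is then used to obtain text class features, and the importance of a word to a class is obtained by computing the semantic similarity between the words in a text and the class features. Finally, the two importance values are combined, and the words with a high classification contribution are selected as the final classification features. Experiments show that the algorithm effectively reduces the number of text features, lowers the computational cost of classification, reduces the impact of noise on classification, and improves classification performance.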
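The final selection step can be sketched as a ranking over the two importance values. This sketch assumes both values have already been computed (from LDA and Word2vec similarities, as the abstract describes) and combines them by multiplication, which is an assumption; the abstract does not state the paper's combination rule.

```python
def select_features(word_doc_importance, word_class_importance, k):
    """Return the k words with the highest combined importance.

    Both arguments map word -> importance in [0, 1]. The product used here
    is a hypothetical combination rule chosen for illustration.
    """
    scores = {
        word: doc_imp * word_class_importance.get(word, 0.0)
        for word, doc_imp in word_doc_importance.items()
    }
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


# Illustrative values only: "the" matters within the text but not for the class.
doc_importance = {"goal": 0.9, "the": 0.8, "match": 0.6}
class_importance = {"goal": 0.9, "the": 0.1, "match": 0.8}
print(select_features(doc_importance, class_importance, k=2))
```

A word that is prominent in a text but uninformative about the class receives a low combined score, which matches the stated aim of reducing both feature count and noise.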

11.
Numerous geometric objects are stored in spatial databases. An important function of a spatial database is to let users browse the geometric objects efficiently as a map, so the database should quickly display the objects of interest in the display window. This process comprises two operations: retrieving data from the database and drawing them on screen. Accordingly, to improve efficiency, we should reduce the time spent both on retrieving objects and on displaying them. The former can be achieved with the aid of a spatial index such as the R-tree; the latter requires simplifying the objects. Simplification means that objects are shown with sufficient but not unnecessary detail, which depends on the browsing scale. The major problem is therefore how to retrieve data at different levels of detail efficiently. This paper introduces the implementation of a multi-scale index in the spatial database SISP (Spatial Information Shared Platform), which is generalized from the R-tree. The generalization differs from the R-tree in two respects. First, every node and geometric object in the generalization is assigned an importance value, and every vertex in an object is assigned an importance value as well. The importance value can be used to decide which data should be retrieved from disk in a query. Second, geometric objects in the generalization are divided into one or more sub-blocks, and vertices are totally ordered by their importance value. With the help of the generalized R-tree, one can easily retrieve data at different levels of detail. Experiments on real-life data evaluate the performance of solutions that use a normal spatial index and a multi-scale spatial index, respectively. The results show that the solution using the multi-scale index in SISP is satisfactory.
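The idea of ordering vertices by importance so that a coarse view reads only a prefix can be sketched as follows. This is an illustration under assumptions: the importance values are supplied by hand, `min_importance` stands in for whatever threshold the browsing scale implies, and nothing here reflects the actual SISP storage layout.

```python
def build_block(vertices, importance):
    """Store vertices sorted by descending importance, remembering their
    original position so that the geometry can be reassembled later."""
    entries = [(imp, idx, pt) for idx, (pt, imp) in enumerate(zip(vertices, importance))]
    return sorted(entries, key=lambda e: -e[0])


def retrieve(block, min_importance):
    """Read the block from the front and stop at the first vertex that is
    too unimportant for the requested level of detail."""
    picked = []
    for imp, idx, pt in block:
        if imp < min_importance:
            break  # everything after this point is even less important
        picked.append((idx, pt))
    picked.sort()  # restore the original vertex order of the polyline
    return [pt for _, pt in picked]


# A zigzag polyline; end points and the middle vertex are marked as important.
line = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0)]
weights = [1.0, 0.2, 0.6, 0.1, 1.0]
block = build_block(line, weights)
print(retrieve(block, 0.5))  # coarse view
print(retrieve(block, 0.0))  # full detail
```

Because the stored order is by importance, a coarse query touches only the leading entries of the block, which is the property that lets a multi-scale index avoid reading unnecessary detail from disk.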

12.
13.
Future mobile communication systems aim at providing very high data transmission rates, even in high-mobility scenarios such as high-speed wheel-track trains, maglev trains, highway vehicles, airplanes, guided missiles, or spacecraft. A particularly important commercial application is the strong and increasing worldwide demand for high-speed broadband wireless communications in railways (up to 574.8 km/h test speeds or 380 km/h commercial speeds), providing data, voice, and video services for applications such as onboard entertainment for passengers, train control, train dispatch, train sensor status handling, and surveillance. In such high-mobility scenarios, there are a number of communication challenges, including fast handover, location updating, high-speed channel modeling, estimation and equalization, anti-Doppler spreading techniques, fast power control, and dedicated network architecture. Because signal transmission in very high-speed scenarios will inevitably experience serious deterioration, it is imperative to develop key broadband mobile communication techniques for such high-speed vehicles.

14.
Instead of following Fock's expansion, we solve the Schrödinger equation for some quantum mechanical many-body systems, such as electrons in atoms and charged excitons in quantum wells, in a similar way in hyperspherical coordinates by expanding the wave functions into orthonormal complete basis sets of the hyperspherical harmonics (HHs) of hyperangles and generalized Laguerre polynomials (GLPs) of the hyperradius. This leads the equation to

15.
As the primary medium of geographic information and the elementary objects manipulated, almost all maps in existing GIS adopt the layer-based model to represent geographic information. However, it is difficult to extend a map represented in the layer-based model. Furthermore, in Web-based GIS, transmitting the spatial data for map viewing is slow. To solve these problems, this paper proposes a new method for representing spatial data: the scale-based model. In this model, maps are represented at three levels (scale-view, block, and spatial object) and organized into a set of map layers, named Scale-Views, each associated with some given scales. Finally, a prototype Web-based GIS using the proposed spatial data representation is described briefly.

16.
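To help improve the personal sentiment, cultural literacy, and learning interest of English enthusiasts, and to help professionals explore English learning and research along multiple paths, this paper cites and analyzes a selection of poems, focusing on the use and effect of metaphor-type rhetorical devices in English poetry.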

17.
Tennessee Williams is considered one of the most important American playwrights since World War II. The Glass Menagerie is his first successful drama; it describes the tragic situation of a family and suggests that Man is unable to change a miserable life, whatever means he tries. This essay analyzes the arrangement of the four main characters, Laura, Amanda, Jim, and Tom, to reveal the theme. Laura is fragile. Amanda is brave. Jim is vital. Tom is sensible. All of them develop and try different means to struggle against life, but fail tragically. With this evidence, the paper concludes that Man is unable to change a miserable life and is doomed to fail.

18.
The aim of this study is to investigate the diversity of Retama raetam root-nodule bacteria isolated from arid regions of Tunisia. Twelve isolates, chosen as representatives of different 16S rRNA gene patterns, were characterized by 16S rRNA gene sequencing and phenotypic analysis. Isolates were assigned to Sinorhizobium, Rhizobium, and Agrobacterium. The Sinorhizobium and Rhizobium isolates showed a large diversity in their capacity to infect their host plant and fix atmospheric nitrogen. Strain RK 22, identified as Rhizobium, was the most effective isolate.

19.
Recently, docking has been widely used to predict the binding modes of protein-inhibitor complexes when the crystal structures of the complexes are absent. Most docking algorithms are able to generate a large number of probable conformations; it is, however, difficult to evaluate these docking poses effectively and identify the most reasonable binding mode. In the present study, on the basis of the crystallographic data of human 3-hydroxy-3-methylglutaryl coenzyme

20.
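This paper introduces two wireless broadband access technologies, WiMAX and Wi-Fi, compares and analyzes the relationship between them and their influence on each other, describes the key technologies of WiMAX in detail, and finally briefly discusses how the two can be combined in a joint network.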
