首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 109 毫秒
1.
一种编辑距离算法及其在网页搜索中的应用   总被引:1,自引:0,他引:1  
针对传统方法不能很好地处理网页中简短域与用户查询之间的相关性排序问题,提出一种基于改进的编辑距离排序算法.将以词为单位的用户查询和简短网页域通过匹配编码转化为2个字符串,再利用改进的编辑距离计算2个字符串之间的相似性.由于在用户查询与待比较的简短网页域之间引入了查询词分布的位置、顺序和距离等,以及含有查询词修饰关系的重要信息,所以编码字符串之间的相似程度可以衡量对应的查询与简短网页域之间的相关性.经大规模真实搜索引擎实验表明,该算法较之传统的相关性排序算法,可以显著地提高网页搜索中的简短网页域相关性排序性能,尤其适用于简短域与用户查询之间的相关性比较.  相似文献   

2.
 为提高政府网站的搜索质量并优化网站内容, 对某政府网站现有搜索系统进行二次开发, 增加了日志挖掘模块、行为分析模块、系统改进模块, 实现了对搜索系统日志挖掘和用户行为的分析处理。日志挖掘模块负责收集、过滤和识别用户的搜索操作记录;在行为分析模块, 根据操作记录从查询过程、聚类分析和查询热词3 个角度, 分析用户行为的特点和规律, 得到了待调整权重的网页和热点查询词等分析结果;在系统改进模块, 通过调整网页的权重使查询结果更加精准, 改善了搜索系统, 根据统计查询热词, 既提供了搜索热点等新功能, 又为用户提供了个性化网页并优化了政府网站的内容, 实现了与舆情系统的数据交互。通过这些优化和改进, 从多方面使搜索系统和政府网站能更好的为用户服务。  相似文献   

3.
针对个性化搜索引擎中不能准确建立用户兴趣模型的问题,通过分析用户有效的搜索行为,得到用户感兴趣的网页,并根据用户对网页的操作情况,得到用户对网页的兴趣程度,据此改进TF-IDF公式,得到用户的兴趣特征词及其权重,建立用户兴趣模型;并根据更新时间因子来实现用户兴趣模型的动态更新.实验结果表明,通过此种方法建立的用户兴趣模型,更加准确地反映了用户的兴趣爱好,提高了搜索结果的准确度.  相似文献   

4.
基于查询\|概念的用户兴趣模型构建   总被引:1,自引:0,他引:1  
针对查询\|概念二分图因概念抓取和查询词权重设计不足而导致构建的用户兴趣模型不合理的问题, 提出一种基于查询\|概念二分图的用户兴趣建模算法。通过tf×idf公式抓取概念, 并利用用户对查询词的浏览时间计算查询词的权重, 确保改进后的查询\|概念二分图能更准确地表示用户的查询意图。实验结果表明, 该算法构建的用户兴趣更为合理。  相似文献   

5.
文本观点检索旨在检索出与查询主题相关并且表达用户对主题观点的文档。由于用户查询时输入通常很短,难以准确表示查询的信息需求。知识图谱是结构化的语义知识库,通过知识图谱中的知识有助于理解用户的信息需求。因此,提出了一种基于知识图谱的文本观点检索方法。首先由知识图谱获取候选查询扩展词,并计算每个候选词扩展词分布、共现频率、邻近关系、文档集频率,然后利用4类特征通过SVM分类得到扩展词,最后利用扩展词对产生式观点检索模型进行扩展,实现对查询的观点检索。实验表明,在微博和推特两个数据集上,与基准工作对比,所提出的方法在MAP、NDCG等评价指标上均有显著的提升。  相似文献   

6.
基于动态知识库搜索引擎的技术   总被引:2,自引:0,他引:2  
“词的不匹配”是全文信息检索中存在的一个基本问题.为解决此问题,已提出过一些查询扩展方法.现提出一种新的基于动态知识库的搜索引擎原型——DKIRS检索系统.它利用用户检索的结果及用户的反馈信息动态地构造知识库,然后基于知识库对初始查询进行扩展,再利用扩展后的查询进行信息检索。  相似文献   

7.
当前主流的搜索引擎根据查询词在网页中的出现频率,辅以网页权威性等信息,生成查询结果.但用户提供的查询词往往非常简单,因此搜索引擎难以确定用户的查询意图.为此,给出了一种利用海量clickthrough数据进行网页内容相关性挖掘的方法,在此基础上给出了一种反馈式搜索引擎(FSE)框架及相关算法.FSE根据网页相关性动态生成查询结果,以期提供给用户更中肯和个性化的信息.基于真实点击数据,进行了网页相关性矩阵的压缩实验和有效性实验,证明了该框架的可行性.  相似文献   

8.
传统的基于关键词匹配的查询方法因查询词短少,微博博文短小,容易引起歧义性,对查询效率有较大影响.提出一种基于本体和局部查询反馈的微博查询扩展算法,首先结合安全领域文档构建安全领域本体知识库,然后利用本体提供的语义知识对初始查询词进行扩展,再结合局部查询反馈对候选扩展词集进行筛选,最后通过二次查询和迭代操作得到最终查询结果.实验结果表明,基于本体和局部查询反馈的微博查询扩展算法比基于关键词的查询扩展算法、基于本体的查询扩展算法和基于"伪相关反馈"的查询扩展算法有更好的查全率和查准率.  相似文献   

9.
为解决信息检索中用户查询可能与索引文档信息表示不匹配从而影响检索效果的问题,提出一种融合局部共现和上下文相似度的查询扩展方法,从与查询词具有共现关系的邻接词和与查询词具有高相关性或同指关系的词两个方面对用户输入查询词进行扩展,重点测试邻接词的取词窗口大小以及上下文向量的最优长度。试验表明:与采用单一扩展方法相比,融合方法的平均准确率取得了明显提高,当邻接词的窗口大小取5,上下文向量的长度取15时,具有更好的平均准确率。  相似文献   

10.
传统的查询扩展技术大都依据单个查询词的相关性来扩展查询词,忽略了查询词之间的相关性以及查询扩展词的不同重要程度,使得扩展效果不佳。针对此问题,提出了一种基于PageRank算法的查询扩展模型,该模型在Markov网络检索模型的基础上,从查询本身出发,将所有与查询相关的词组成Markov查询关联子网,在此子网上应用PageRank算法来计算候选扩展词的权重,由权重序来确定扩展词的选取,排名前列的扩展词进入检索阶段,消除噪音,提高检索效率。在标准数据集上的实验结果表明,本文提出的模型能有效地改善检索效果。  相似文献   

11.
The discovery of the prolific Ordovician Red River reservoirs in 1995 in southeastern Saskatchewan was the catalyst for extensive exploration activity which resulted in the discovery of more than 15 new Red River pools. The best yields of Red River production to date have been from dolomite reservoirs. Understanding the processes of dolomitization is, therefore, crucial for the prediction of the connectivity, spatial distribution and heterogeneity of dolomite reservoirs.The Red River reservoirs in the Midale area consist of 3~4 thin dolomitized zones, with a total thickness of about 20 m, which occur at the top of the Yeoman Formation. Two types of replacement dolomite were recognized in the Red River reservoir: dolomitized burrow infills and dolomitized host matrix. The spatial distribution of dolomite suggests that burrowing organisms played an important role in facilitating the fluid flow in the backfilled sediments. This resulted in penecontemporaneous dolomitization of burrow infills by normal seawater. The dolomite in the host matrix is interpreted as having occurred at shallow burial by evaporitic seawater during precipitation of Lake Almar anhydrite that immediately overlies the Yeoman Formation. However, the low δ18O values of dolomited burrow infills (-5.9‰~ -7.8‰, PDB) and matrix dolomites (-6.6‰~ -8.1‰, avg. -7.4‰ PDB) compared to the estimated values for the late Ordovician marine dolomite could be attributed to modification and alteration of dolomite at higher temperatures during deeper burial, which could also be responsible for its 87Sr/86Sr ratios (0.7084~0.7088) that are higher than suggested for the late Ordovician seawaters (0.7078~0.7080). The trace amounts of saddle dolomite cement in the Red River carbonates are probably related to "cannibalization" of earlier replacement dolomite during the chemical compaction.  相似文献   

12.
There are numerous geometric objects stored in the spatial databases. An importance function in a spatial database is that users can browse the geometric objects as a map efficiently. Thus the spatial database should display the geometric objects users concern about swiftly onto the display window. This process includes two operations:retrieve data from database and then draw them onto screen. Accordingly, to improve the efficiency, we should try to reduce time of both retrieving object and displaying them. The former can be achieved with the aid of spatial index such as R-tree, the latter require to simplify the objects. Simplification means that objects are shown with sufficient but not with unnecessary detail which depend on the scale of browse. So the major problem is how to retrieve data at different detail level efficiently. This paper introduces the implementation of a multi-scale index in the spatial database SISP (Spatial Information Shared Platform) which is generalized from R-tree. The difference between the generalization and the R-tree lies on two facets: One is that every node and geometric object in the generalization is assigned with a importance value which denote the importance of them, and every vertex in the objects are assigned with a importance value,too. The importance value can be use to decide which data should be retrieve from disk in a query. The other difference is that geometric objects in the generalization are divided into one or more sub-blocks, and vertexes are total ordered by their importance value. With the help of the generalized R-tree, one can easily retrieve data at different detail levels.Some experiments are performed on real-life data to evaluate the performance of solutions that separately use normal spatial index and multi-scale spatial index. The results show that the solution using multi-scale index in SISP is satisfying.  相似文献   

13.
AcomputergeneratorforrandomlylayeredstructuresYUJia shun1,2,HEZhen hua2(1.TheInstituteofGeologicalandNuclearSciences,NewZealand;2.StateKeyLaboratoryofOilandGasReservoirGeologyandExploitation,ChengduUniversityofTechnology,China)Abstract:Analgorithmisintrod…  相似文献   

14.
本文叙述了对海南岛及其毗邻大陆边缘白垩纪到第四纪地层岩石进行古地磁研究的全部工作过程。通过分析岩石中剩余磁矢量的磁偏角及磁倾角的变化,提出海南岛白垩纪以来经历的构造演化模式如下:早期伴随顺时针旋转而向南迁移,后期伴随逆时针转动并向北运移。联系该地区及邻区的地质、地球物理资料,对海南岛上述的构造地体运动提出以下认识:北部湾内早期有一拉张作用,主要是该作用使湾内地壳显著伸长减薄,形成北部湾盆地。从而导致了海南岛的早期构造运动,而海南岛后期的构造运动则主要是受南海海底扩张的影响。海南地体运动规律的阐明对于了解北部湾油气盆地的形成演化有重要的理论和实际意义。  相似文献   

15.
Various applications relevant to the exciton dynamics,such as the organic solar cell,the large-area organic light-emitting diodes and the thermoelectricity,are operating under temperature gradient.The potential abnormal behavior of the exicton dynamics driven by the temperature difference may affect the efficiency and performance of the corresponding devices.In the above situations,the exciton dynamics under temperature difference is mixed with  相似文献   

16.
The elongation method,originally proposed by Imamura was further developed for many years in our group.As a method towards O(N)with high efficiency and high accuracy for any dimensional systems.This treatment designed for one-dimensional(ID)polymers is now available for three-dimensional(3D)systems,but geometry optimization is now possible only for 1D-systems.As an approach toward post-Hartree-Fock,it was also extended to  相似文献   

17.
18.
The explosive growth of the Internet and database applications has driven database to be more scalable and available, and able to support on-line scaling without interrupting service. To support more client's queries without downtime and degrading the response time, more nodes have to be scaled up while the database is running. This paper presents the overview of scalable and available database that satisfies the above characteristics. And we propose a novel on-line scaling method. Our method improves the existing on-line scaling method for fast response time and higher throughputs. Our proposed method reduces unnecessary network use, i.e. , we decrease the number of data copy by reusing the backup data. Also, our on-line scaling operation can be processed parallel by selecting adequate nodes as new node. Our performance study shows that our method results in significant reduction in data copy time.  相似文献   

19.
R-Tree is a good structure for spatial searching. But in this indexing structure,either the sequence of nodes in the same level or sequence of traveling these nodes when queries are made is random. Since the possibility that the object appears in different MBR which have the same parents node is different, if we make the subnode who has the most possibility be traveled first, the time cost will be decreased in most of the cases. In some case, the possibility of a point belong to a rectangle will shows direct proportion with the size of the rectangle. But this conclusion is based on an assumption that the objects are symmetrically distributing in the area and this assumption is not always coming into existence. Now we found a more direct parameter to scale the possibility and made a little change on the structure of R-tree, to increase the possibility of founding the satisfying answer in the front sub trees. We names this structure probability based arranged R-tree (PBAR-tree).  相似文献   

20.
The geographic information service is enabled by the advancements in general Web service technology and the focused efforts of the OGC in defining XML-based Web GIS service. Based on these models, this paper addresses the issue of services chaining,the process of combining or pipelining results from several interoperable GIS Web Services to create a customized solution. This paper presents a mediated chaining architecture in which a specific service takes responsibility for performing the process that describes a service chain. We designed the Spatial Information Process Language (SIPL) for dynamic modeling and describing the service chain, also a prototype of the Spatial Information Process Execution Engine (SIPEE) is implemented for executing processes written in SIPL. Discussion of measures to improve the functionality and performance of such system will be included.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号