Similar literature
 Found 20 similar documents; search took 15 ms
1.
Focused crawlers are important tools for applications such as specialized Web portals, online searching, and Web search engines. A topic-driven crawler chooses the best URLs and most relevant pages to pursue during Web crawling, but irrelevant pages are difficult to deal with. This paper presents a novel focused crawler framework. In our focused crawler, we propose a method that overcomes some of the limitations of handling irrelevant pages. We also describe the implementation of our focused crawler and present some important metrics and an evaluation function for ranking page relevance. The experimental results show that our crawler obtains more "important" pages and achieves high precision and recall.
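
The abstract above does not give the crawler's evaluation function, so the following is only a minimal sketch of the general best-first idea behind topic-driven crawling. The word-overlap `relevance` score, the `fetch` callback, and the rule that outlinks inherit their parent's score are all assumptions made for illustration, not the authors' method.

```python
import heapq

def relevance(text, topic_terms):
    """Toy score: fraction of the words in `text` that are topic terms."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(1 for w in words if w in topic_terms) / len(words)

def focused_crawl(seed, fetch, topic_terms, max_pages=10):
    """Visit pages best-first. `fetch(url)` must return (text, outlinks)."""
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so scores are negated
    seen = {seed}
    visited = []
    while frontier and len(visited) < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = fetch(url)
        score = relevance(text, topic_terms)
        visited.append((url, score))
        for link in links:
            if link not in seen:
                seen.add(link)
                # Assumption: an outlink is as promising as the page linking to it.
                heapq.heappush(frontier, (-score, link))
    return visited
```

With this ordering, links found on relevant pages are followed before links found on irrelevant ones, which is one simple way to limit time spent on off-topic regions.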

2.
Automatic Web services composition based on SLM   Total citations: 1, self-citations: 0, citations by others: 1
Motivated by the problem of simplifying the manual composition process, we propose an approach that automatically composes available Web services to fulfill a user's goal, assuming that there is a set of alternative Web services with similar functionality and different QoS properties. A formal model, the semantic links matrix (SLM), is proposed to store the semantic link values for semantically related Web services together with their QoS. The SLM provides the search space for a backward-search planning algorithm, while the QoS criteria support a rational and effective choice among similar Web services. The function and some properties of the algorithm are analyzed. The approach can improve the correctness and flexibility of Web services composition and satisfy local QoS attributes.

3.
To alleviate the scalability problem caused by growing Web usage and changing user interests, this paper presents a novel Web usage mining algorithm: an incremental Web usage mining algorithm based on active ant colony clustering. First, an active movement strategy for direction selection and speed, different from the positive strategy employed by other ant colony clustering algorithms, is proposed to construct an active ant colony clustering algorithm. It avoids idle movement and the "flying over the plane" phenomenon, and it effectively improves the quality and speed of clustering on large datasets. Then a mechanism for decomposing clusters based on the above methods is introduced to form new clusters when users' interests change. Empirical studies on a real Web dataset show that the active ant colony clustering algorithm performs better than previous algorithms, and that the incremental approach based on the proposed mechanism can efficiently implement incremental Web usage mining.

4.
Caching is an important technique for enhancing the efficiency of query processing. Unfortunately, traditional caching mechanisms are not efficient for the deep Web because of storage space and dynamic maintenance limitations. In this paper, we present a cache mechanism for deep Web queries that is based on Top-K data sources (KDS-CM) instead of result records. By integrating techniques from IR and Top-K, a data reorganization strategy is presented to model KDS-CM. Some cache management and optimization measures are also proposed to improve cache performance effectively. Experimental results show the benefits of KDS-CM in execution cost and dynamic maintenance when compared with various alternative strategies.

5.
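By defining hypertrees in the narrow sense and hypertrees in the broad sense, and by using the formula for the cycle number of a hypergraph together with the bipartite graph corresponding to a hypergraph, a series of meaningful propositions on the relationship between narrow-sense and broad-sense hypertrees is obtained, further refining the theory of hypertrees.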

6.
This paper proposes a new approach to classifying Deep Web query interfaces. It extracts features from the text of the forms on the query interfaces, assisted by a synonym library, and uses a radial basis function neural network (RBFNN) algorithm to classify the interfaces. The RBFNN is an effective feed-forward artificial neural network with a simple network structure, excellent nonlinear approximation, fast convergence, and global convergence. The TEL_8 query interface data set from the UIUC on-line database, which consists of 477 query interfaces in 8 typical domains, is used in our experiments. Experimental results show that the proposed approach can efficiently classify the query interfaces with an accuracy of 95.67%.

7.
0 Introduction. The substitution and permutation network (SPN) structure is one of the most widely used structures in block ciphers. The SPN structure is based on Shannon's principles of confusion and diffusion[1], and these principles are implemented through …

8.
Deep Web sources contain a large amount of high-quality, query-related structured data. One of the challenges in the Deep Web is extracting the result schemas of Deep Web sources. To address this challenge, this paper describes a novel approach that extracts both the result data and the result schema of a Web database. The approach first models the query interface of a Deep Web source and fills it in with a specific query instance. Then the result pages of the Deep Web source are formatted as a tree structure to retrieve the subtrees that contain elements of the query instance. Next, the result schema of the Deep Web source is extracted by matching the subtrees' nodes with the query instance; a two-phase schema extraction method is adopted to obtain a more accurate result schema. Finally, experiments on real Deep Web sources show the utility of our approach, which provides high precision and recall.

9.
To help users access the desired information, much research has been dedicated to Deep Web (i.e. Web database) integration. We focus on query translation, which is an important part of Deep Web integration. Our aim is to automatically construct a set of constraint mapping rules so that the system can use them to translate a query from the integrated interface to the Web database interfaces. We construct a concept hierarchy for the attributes of the query interfaces and, in particular, store the synonyms and the types (e.g. Number, Text, etc.) for every concept. At the same time, we construct data hierarchies for some concepts if necessary. Then we present an algorithm that generates the constraint mapping rules based on these hierarchies. The approach scales well for such applications and, being domain independent, can easily be extended from one domain to another. The experimental results show its effectiveness and efficiency.

10.
A new design of LHM (left-handed material), in which the wave vector k and the energy flow S (the Poynting vector) point in opposite directions, is suggested. Metallic cores or lines are coated with ferromagnetic layers to obtain negative permittivity and permeability. This design may bring some improvements over the binary design, such as higher homogeneity, smaller volume, lower power loss, and greater convenience and economy. The analytical expressions for the permittivity ε and permeability μ are shown to be negative in certain directions and frequency regions. Two specific structures are discussed theoretically and proved to be left-handed.

11.
The Web offers a very convenient way to access remote information resources, and an important measure of Web service quality is how long it takes to search for and retrieve information. Caching the Web server's dynamic content avoids repeated database queries and reduces the access frequency of the original resources, thereby improving the server's response speed. This paper describes the concept, advantages, principles, and concrete implementation procedure of a dynamic content cache module for a Web server.

12.
A novel personalized Web search model is proposed. The new system, acting as middleware between a user and a Web search engine, is set up on the client machine. It learns a user's preferences implicitly and then generates the user profile automatically. When the user inputs query keywords, the system automatically generates a few personalized expansion words by computing term-term associations according to the current user profile, and these words are then submitted together with the query keywords to a popular search engine such as Yahoo or Google. The expansion words help to express the user's search intention accurately. The new Web search model can personalize a common search engine; that is, the search engine can return different search results to different users who input the same keywords. The experimental results show the feasibility and applicability of the presented work.
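
The abstract above does not say how the term-term associations are computed. As a rough sketch only, the idea of profile-driven query expansion can be illustrated with plain co-occurrence counts; `build_associations`, `expand_query`, and the use of document-level co-occurrence are hypothetical choices for illustration, not the paper's model.

```python
from collections import Counter
from itertools import combinations

def build_associations(profile_docs):
    """Count how often two distinct terms co-occur in the same profile document."""
    assoc = Counter()
    for doc in profile_docs:
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            assoc[(a, b)] += 1
            assoc[(b, a)] += 1
    return assoc

def expand_query(query, assoc, k=2):
    """Return up to k terms most strongly associated with the query terms."""
    q_terms = set(query.lower().split())
    scores = Counter()
    for (a, b), n in assoc.items():
        if a in q_terms and b not in q_terms:
            scores[b] += n
    # Highest score first; alphabetical order breaks ties deterministically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]
```

Two users with different profiles would get different expansion terms for the same keyword, which is the effect the abstract describes.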

13.
The demand for individualized teaching from E-learning websites is rapidly increasing because of the large differences among Web learners. A method for clustering Web learners based on rough sets is proposed. The basic idea of the method is to reduce the learning attributes prior to clustering, so that the clustering of Web learners is carried out in a relatively low-dimensional space. Using this method, E-learning websites can arrange corresponding teaching content for different clusters of learners so that the learners' individual requirements can be better satisfied.

14.
In this paper, a Web service composition architecture based on a structured P2P network is proposed. Semantics is used to achieve accurate service matching and user personality customization. Through Web service virtual mapping (WVM) association, fast computation of distributed service composition based on service function is also implemented. The Web service composition architecture and distributed service composition algorithm proposed in this paper solve a series of existing problems in service discovery and composition in a distributed environment, and provide a service composition result that meets the user's personality requirements. At the same time, they improve the efficiency of service composition calculation.

15.
A passage retrieval strategy for Web-based question answering (QA) systems is proposed in our QA system. It first analyzes the question using semantic patterns to obtain its syntactic and semantic information and then forms initial queries. The queries are used to retrieve documents from the World Wide Web (WWW) using the Google search engine. The queries are then rewritten to form queries for passage retrieval in order to improve precision. The relations between keywords in the question are employed in our query rewriting method. The experimental results on the question set of the TREC-2003 passage task show that our system performs well for factoid questions.

16.
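Based on an analysis of three basic strategies for de novo sequence assembly, this paper studies the construction of a hybrid-strategy sequence assembly algorithm that integrates the advantages of the individual strategies. Drawing on the strengths of formal methods and formal platforms, and combining domain analysis modeling with generative programming, two algorithms based on the OLC strategy (OLC_assembly_1, OLC_assembly_2) and one algorithm based on the DBG strategy (DBG_assembly) are constructed, and these are further assembled into an algorithm under the (OLC+DBG)→OLC hybrid mode (the ODO algorithm for short). Finally, three experimental samples are selected from GenBank, the assembly results of the three single-strategy algorithms and the constructed ODO algorithm are compared in terms of N50, contig number, coverage, and other measures, and the effect of varying the coverage depth and the value of k on the assembly results is analyzed. The experimental results show that the ODO algorithm implemented in this paper has some advantage over the single-strategy algorithms in parameters such as N50 and coverage.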
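
N50, one of the comparison metrics named above, is commonly defined as the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal sketch under that standard definition follows; the function name `n50` and the convention of returning 0 for an empty input are assumptions.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0
```

For contigs of lengths 2 through 10 (total 54), the three longest contigs sum to 27, so the N50 is 8.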

17.
The main aim of this paper is to establish several new criteria on the attractor for the solutions of neutral stochastic functional differential equations. A kind of ψ-function is introduced into our discussion, and some results on the attractor for the product of the ψ-function and the solutions are obtained. As a byproduct, a number of new criteria on asymptotic stability are also shown. Biography: CEN Liqun (1975–), female, Ph.D. candidate, research direction: theory and applications of stochastic differential equations.

18.
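A new physical organization method for main memory databases (MMDB), the bipartite graph method for main memory databases, is proposed, and its physical implementation and the data operations on it are described in detail. The study shows that this method improves the system's space utilization and access performance and is an effective way to implement an MMDB.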

19.
20.
The Web cluster has been a popular solution for network server systems because of its scalability and cost-effectiveness. A cache configured in the servers can significantly increase performance. In this paper, we discuss suitable configuration strategies for caching dynamic content based on our experimental results. Because the system itself already supports caching of static Web pages, for example through the computer's memory cache and the disk's own cache, we adopt a special pattern in some experiments that caches only dynamic Web pages in order to enlarge the cache space. The paper introduces three different replacement algorithms in our cache proxy module to test the practical effects of caching dynamic pages under different conditions. It chiefly analyzes the influence of generation time and access frequency on caching dynamic Web pages, and it also provides detailed experimental results and the main conclusions.
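
The abstract does not name its three replacement algorithms. Purely to illustrate how generation time and access frequency could both enter a replacement decision, here is a toy cache that evicts the page with the lowest product of hit count and generation time; the class `DynamicPageCache` and this scoring rule are assumptions for illustration, not the algorithms evaluated in the paper.

```python
class DynamicPageCache:
    """Toy cache that evicts the page with the lowest hits * generation_time."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # url -> [content, generation_time, hits]

    def get(self, url):
        """Return cached content and count the hit, or None on a miss."""
        entry = self.entries.get(url)
        if entry is None:
            return None
        entry[2] += 1
        return entry[0]

    def put(self, url, content, generation_time):
        """Store a page; when full, drop the entry that is cheapest to lose."""
        if url not in self.entries and len(self.entries) >= self.capacity:
            victim = min(
                self.entries,
                key=lambda u: self.entries[u][1] * self.entries[u][2],
            )
            del self.entries[victim]
        self.entries[url] = [content, generation_time, 1]
```

Under this rule, a page that is slow to generate and requested often is kept in preference to one that is cheap to regenerate or rarely requested.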

