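Research on XML-Based Automatic Extraction of Deep Web Information
Cite this article: PENG Yuan-yuan, XU Jian-chao. Research on XML-Based Automatic Extraction of Deep Web Information[J]. 科技信息, 2009(33): 85-85, 104.
Authors: PENG Yuan-yuan, XU Jian-chao
Affiliation: Changchun University of Technology, Changchun, Jilin 130012
Abstract: With the rapid growth of the Internet in recent years, the Deep Web has become an important part of online information resources. Users obtain the vast amount of information it holds dynamically, by querying the back-end Web databases online through query interfaces. Because Deep Web resources are spread across individual Deep Web sites and are heterogeneous, dynamic, and large in volume, they are inconvenient to use, which has given rise to data integration systems for the Deep Web. This paper studies the data extraction technology in Deep Web data integration systems, proposes an XML-based method for automatically extracting Deep Web data, and analyzes the technique in detail. The method extracts Deep Web resources quickly and effectively, with high extraction accuracy and fine extraction granularity.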

Keywords: information extraction; Deep Web; Deep Web data integration; XML

Deep Web Information Automatic Extraction Technology Based On XML
PENG Yuan-yuan,XU Jian-chao.Deep Web Information Automatic Extraction Technology Based On XML[J].Science,2009(33):85-85,104.
Authors:PENG Yuan-yuan  XU Jian-chao
Institution: Changchun University of Technology, Changchun, Jilin 130012
Abstract: With the rapid development of the Internet in recent years, the Deep Web has become an important part of network information resources; its vast information can be accessed only through the query interfaces provided by Web databases. Data in the Deep Web are returned as dynamic Web pages when users submit a query. Because Deep Web resources are spread across many Deep Web sites and are heterogeneous, dynamic, and large in volume, they are inconvenient to use, and Deep Web data integration systems have emerged in response. This paper studies the data extraction technology in Deep Web data integration systems, proposes an XML-based method for automatic Deep Web data extraction, and gives a detailed technical analysis of it. The method extracts Deep Web resources quickly and efficiently, with high accuracy and fine extraction granularity.
Keywords: Information extraction; Deep Web; Deep Web data integration; XML
This article is indexed in VIP (维普) and other databases.