
Integrating Multi-Source Web Records into Relational Database
Authors: HUANG Jianbin, JI Hongbing, SUN Heli
Affiliations: [1] School of Electronic Engineering, Xidian University, Xi'an 710071, Shaanxi, China; [2] School of Computer Science, Xidian University, Xi'an 710071, Shaanxi, China; [3] Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, Shaanxi, China
Abstract: How to integrate heterogeneous semi-structured Web records into a relational database is an important and challenging research topic. An improved conditional random field model is presented that combines learning from labeled samples with unlabeled database records, reducing the dependence on tediously hand-labeled training data. The proposed model is used to solve the schema matching problem between data source schemas and the database schema. Experimental results on a large number of Web pages from diverse domains show the effectiveness of the approach.

Keywords: Web data integration; data mining; schema matching; conditional random fields
Article ID: 1007-1202(2006)05-1177-05
Received: 2006-03-20

Citation: HUANG Jianbin, JI Hongbing, SUN Heli. Integrating Multi-Source Web Records into Relational Database[J]. Wuhan University Journal of Natural Sciences, 2006, 11(5): 1177-1181.
Foundation item: Supported by the National Defense Pre-Research Foundation of China (410105018)
Biography: HUANG Jianbin (1975-), male, Ph.D. candidate, lecturer; research directions: machine learning, Web mining.