Self-learning extraction of Web information based on filling-tag



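Cite as:	Li Yongping, Jin Li. Self-learning extraction of Web information based on filling-tag[J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2003, 31(11): 31-32.

Authors:	Li Yongping, Jin Li

Affiliation:	College of Computer Science and Technology, Huazhong University of Science and Technology

Funding:	Supported by the National High Performance Computing Fund (993 1 9)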

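Abstract:	A self-adaptive algorithm for Web information extraction is proposed. Built on a bottom-up cascade of rule modules, it fills the extraction template with a certain number of SGML tags that help identify the category of the information. This covers unseen information in Web pages reasonably well, effectively controls both under-extraction and overflow of information during adaptation, and achieves intelligent Web information extraction.
Keywords:	Web information extraction; filling-tag; self-learning; rule induction
Article ID:	1671-4512(2003)11-0031-02
Revised:	May 28, 2003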

Li Yongping, Jin Li. Self-learning extraction of Web information based on filling-tag[J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2003, 31(11): 31-32.

Authors:	Li Yongping, Jin Li

Institution:	College of Computer Sci. & Tech., Huazhong Univ. of Sci. & Tech., Wuhan 430074, China

Abstract:	A self-learning algorithm for Web information extraction is presented, based on a bottom-up cascade of rule modules. The extraction template is filled with a certain number of SGML tags that help identify the type of the information, so that the template covers unseen information on Web pages. Under-extraction and overflow of information are controlled during learning, yielding intelligent extraction of Web information.

Keywords:	Web information extraction; filling-tag; self-learning; rule induction
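
The abstract gives no implementation detail, so the following is only a minimal Python sketch of the general idea it describes, not the authors' algorithm. It assumes a rule is a pair of tag sequences surrounding a text value (the "filled" tags), and that learning proceeds bottom-up: start from the most specific rule (most context tags) and drop outer tags until every labelled value is covered, stopping as soon as the rule would extract unlabelled text (overflow). All names (`tokenize`, `context`, `extract`, `learn`, `max_k`) and the sample pages are hypothetical.

```python
import re

TAG = re.compile(r"<[^>]+>")

def normalize(tag):
    """Reduce a tag to its lower-cased name, e.g. '<SPAN class=p>' -> '<span>'."""
    m = re.match(r"<\s*(/?)\s*([A-Za-z][A-Za-z0-9]*)", tag)
    return f"<{m.group(1)}{m.group(2).lower()}>" if m else tag.lower()

def tokenize(markup):
    """Split markup into ('tag', name) and ('text', value) tokens."""
    tokens, pos = [], 0
    for m in TAG.finditer(markup):
        text = markup[pos:m.start()].strip()
        if text:
            tokens.append(("text", text))
        tokens.append(("tag", normalize(m.group())))
        pos = m.end()
    tail = markup[pos:].strip()
    if tail:
        tokens.append(("text", tail))
    return tokens

def context(tokens, i, k):
    """Up to k contiguous tags immediately left and right of token i."""
    left, j = [], i - 1
    while j >= 0 and tokens[j][0] == "tag" and len(left) < k:
        left.insert(0, tokens[j][1])
        j -= 1
    right, j = [], i + 1
    while j < len(tokens) and tokens[j][0] == "tag" and len(right) < k:
        right.append(tokens[j][1])
        j += 1
    return tuple(left), tuple(right)

def extract(tokens, rule):
    """Return every text value whose surrounding tags match the rule."""
    left, right = rule
    hits = []
    for i, (kind, value) in enumerate(tokens):
        if kind != "text":
            continue
        if (context(tokens, i, len(left))[0] == left
                and context(tokens, i, len(right))[1] == right):
            hits.append(value)
    return hits

def learn(pages, max_k=3):
    """Bottom-up generalization: shrink the tag context from max_k downwards.

    pages is a list of (markup, labelled_values). Returns the last rule that
    caused no overflow, or None if even the most specific rule overflows.
    """
    tokenized = [(tokenize(markup), set(values)) for markup, values in pages]
    tokens, values = tokenized[0]
    seed = next(i for i, (kind, v) in enumerate(tokens)
                if kind == "text" and v in values)
    best = None
    for k in range(max_k, -1, -1):
        rule = context(tokens, seed, k)
        missed = overflow = 0
        for toks, wanted in tokenized:
            got = set(extract(toks, rule))
            missed += len(wanted - got)     # labelled values not extracted
            overflow += len(got - wanted)   # unlabelled text extracted
        if overflow:        # rule has become too general: keep the previous one
            break
        best = rule
        if not missed:      # all labelled values covered: stop generalizing
            break
    return best

PAGES = [
    ('<ul><li><b>Book A</b><span class="p"><i>$10</i></span></li>'
     '<li><b>Book B</b><span><i>$12</i></span></li></ul>', {"$10", "$12"}),
    ("<p><i>$7</i></p>", {"$7"}),
]

if __name__ == "__main__":
    rule = learn(PAGES)
    print(rule)
    print(extract(tokenize("<div><i>$3</i></div>"), rule))
```

In this sketch the number of context tags is the knob that trades under-extraction against overflow: more tags make the rule stricter and more likely to miss values on pages with slightly different markup, while fewer tags make it more likely to match unrelated text.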
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
