Evolution of Information Extraction Techniques on the Web (Web信息抽取技术研究进展)

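Cite this article:	CHEN Shao-fei, HAO Ya-nan, LI Tian-zhu, XU Lin-hao, YANG Wen-zhu. Evolution of Information Extraction Techniques on the Web[J]. Journal of Hebei University (Natural Science Edition), 2003, 23(1): 106-112.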

Authors:	陈少飞 (CHEN Shao-fei), 郝亚南 (HAO Ya-nan), 李天柱 (LI Tian-zhu), 徐林昊 (XU Lin-hao), 杨文柱 (YANG Wen-zhu)

Affiliation:	College of Mathematics and Computer Science, Hebei University, Baoding 071002, Hebei, China

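Abstract:	Web information extraction is an active research topic. Many extraction techniques based on different principles have emerged, and they differ in performance. This paper classifies existing information extraction techniques by their underlying principles and, with reference to representative systems, analyzes and compares them in five respects: how semantics are attached, how schemas are defined, how rules are represented, how semantic items are located, and how objects are located. On this basis, it identifies problems that remain to be studied.
Keywords:	HTML; XML; semantics; rules; information extraction
Article ID:	1000-1565(2003)01-0106-07
Revised:	7 June 2002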
Evolution of Information Extraction Techniques on the Web

CHEN Shao-fei, HAO Ya-nan, LI Tian-zhu, XU Lin-hao, YANG Wen-zhu. Evolution of Information Extraction Techniques on the Web[J]. Journal of Hebei University (Natural Science Edition), 2003, 23(1): 106-112.

Authors:	CHEN Shao-fei, HAO Ya-nan, LI Tian-zhu, XU Lin-hao, YANG Wen-zhu

Abstract:	Information extraction from the Web is an active research topic. Many extraction techniques based on different principles have emerged, each with different capabilities. In this paper, we classify existing information extraction techniques by their underlying principles and analyze how each approach attaches semantic information, defines schemas, expresses rules, locates semantic items, and locates objects. Based on this survey and analysis, we identify several open problems.

Keywords:	HTML; XML; semantics; rule; information extraction
This article is indexed in databases including CNKI and Wanfang Data.
