An optimization method for Web information extraction rules

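Cite this article:	LI Xiang-yang, DAI Jiang-shan, ZHANG Ya-fei. An optimization method for Web information extraction rules[J]. Journal of Lanzhou University of Technology, 2006, 32(1): 90-93.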

Authors:	LI Xiang-yang, DAI Jiang-shan, ZHANG Ya-fei

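Affiliations:	1. Institute of Communications Engineering, PLA University of Science and Technology, Nanjing, Jiangsu 210007; 2. Training Department, PLA University of Science and Technology, Nanjing, Jiangsu 210007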

Funding:	National Natural Science Foundation of China (60303024)

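Abstract:	This paper proposes an optimization method for Web information extraction rules that improves extraction efficiency. Following a graduation (tiered) approach, the restrictions in the original rules are divided into two parts: rough rules and fine rules. Rough rules are applied to every information fragment in a Web page and perform preliminary filtering; fine rules are applied only to the fragments that survive the filtering and extract the final information. This avoids applying every restriction in the rules to every fragment in the page, which reduces computation and speeds up extraction.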
Keywords:	graduation mechanism; path expression; information extraction; rule optimization
Article ID:	1673-5196(2006)01-0090-04
Received:	2005-04-30
Revised:	2005-04-30
An optimization method for Web information extraction rules

LI Xiang-yang, DAI Jiang-shan, ZHANG Ya-fei. An optimization method for Web information extraction rules[J]. Journal of Lanzhou University of Technology, 2006, 32(1): 90-93.

Authors:	LI Xiang-yang, DAI Jiang-shan, ZHANG Ya-fei

Abstract:	An optimization method for Web information extraction rules is presented to improve the efficiency of extraction. A graduation mechanism is employed to divide the restrictions in the initial rule set into rough rules and fine rules. Rough rules are applied to all fragments in a Web page as a preliminary filter, while fine rules are applied only to the fragments retained by the rough rules and extract the final information. As a result, applying every restriction in the initial rule set to every fragment is avoided, and the computation in the extraction process is reduced.

Keywords:	graduation mechanism; path expression; information extraction; rule optimization
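
The two-level idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Fragment` and `TwoLevelRule` types, the tag-path suffix used as the rough rule, and the price regular expression used as the fine rule are all hypothetical choices made for illustration, since the abstract does not specify how restrictions are assigned to each level. The sketch only shows that the costlier fine test runs on the fragments that pass the cheap rough test, not on the whole page.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Fragment:
    """One information fragment of a page (hypothetical representation)."""
    path: str  # tag path of the fragment, e.g. "html/body/table/tr/td"
    text: str  # text content of the fragment


@dataclass
class TwoLevelRule:
    """An extraction rule whose restrictions are split into two levels."""
    rough: Callable[[Fragment], bool]  # cheap test, applied to every fragment
    fine: Callable[[Fragment], bool]   # costlier test, applied to survivors only


def extract(fragments: List[Fragment], rule: TwoLevelRule) -> List[str]:
    # Level 1: the rough rule filters all fragments of the page.
    candidates = [f for f in fragments if rule.rough(f)]
    # Level 2: the fine rule is evaluated only on the filtered fragments.
    return [f.text for f in candidates if rule.fine(f)]


# Assumed example restrictions: a path test (rough) and a regex test (fine).
PRICE = re.compile(r"^\$\d+\.\d{2}$")
stats = {"fine_calls": 0}  # counts how often the fine rule is evaluated


def fine_price(fragment: Fragment) -> bool:
    stats["fine_calls"] += 1
    return PRICE.match(fragment.text) is not None


rule = TwoLevelRule(
    rough=lambda f: f.path.endswith("tr/td"),
    fine=fine_price,
)

page = [
    Fragment("html/body/h1", "Catalog"),
    Fragment("html/body/table/tr/td", "Widget"),
    Fragment("html/body/table/tr/td", "$19.99"),
    Fragment("html/body/p", "$5.00 shipping"),
    Fragment("html/body/div/span", "$3.50"),
]

result = extract(page, rule)
print(result)               # ['$19.99']
print(stats["fine_calls"])  # 2 of the 5 fragments reached the fine rule
```

In this toy page the fine rule is evaluated twice instead of five times. How much is saved in practice depends on how selective the rough rule is and how expensive the fine restrictions are.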
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.