A Method for Extracting Web Page Main Text Based on Hyperlink Analysis (基于超链接分析的网页正文提取方法)

Cite this article:	REN Xiang, LIU Bin. 基于超链接分析的网页正文提取方法 [J]. 泰山学院学报 (Journal of Taishan University), 2010, 32(3): 44-48.

Author names:	REN Xiang (任翔), LIU Bin (刘彬)

Affiliation:	School of Information Science and Technology, Taishan University, Tai'an, Shandong 271021

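Abstract (translated from the Chinese):	With the rapid growth of the Internet, web services have become one of the hot topics of research. This paper presents a text preprocessing technique for document-type web page files. The method parses the structure of a web page file and extracts the main text from it for further processing. Tests show that the method quickly and effectively obtains the main body of most HTML pages.
Keywords:	web page main text; web services; hyperlink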
Research on Main Text Extraction for Chinese Web Pages Based on Web Hyperlink

REN Xiang, LIU Bin. Research on Main Text Extraction for Chinese Web Pages Based on Web Hyperlink [J]. Journal of Taishan University, 2010, 32(3): 44-48.

Authors:	REN Xiang, LIU Bin

Institution:	School of Information Science and Technology, Taishan University, Tai'an 271021, China

Abstract:	With the growth of the Internet, web services have become a focus of research. This paper proposes a preprocessing method for Chinese web pages. The method parses web pages and extracts the main text from them. Experiments show that the method is feasible for parsing web pages.

Keywords:	main text of web pages; web service; hyperlink
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).
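
The abstract says only that the method parses a page's structure and extracts the main text by analyzing hyperlinks; it does not give the algorithm. As a rough illustration of the general idea, and not the authors' method, the sketch below uses a common link-density heuristic: a block whose text lies mostly inside `<a>` elements is treated as navigation and dropped. The names `LinkDensityParser` and `extract_main_text`, the set of block tags, and both thresholds are assumptions made for this example.

```python
from html.parser import HTMLParser

# Assumed set of tags that delimit text blocks; a real page may need more.
BLOCK_TAGS = {"p", "div", "td", "li", "article", "section"}
SKIP_TAGS = {"script", "style"}


class LinkDensityParser(HTMLParser):
    """Collects (text, chars_inside_links) for each run of text between block tags."""

    def __init__(self):
        super().__init__()
        self.blocks = []       # finished blocks
        self._text = []        # text pieces of the block being built
        self._link_chars = 0   # characters of the current block inside <a>
        self._in_link = 0      # nesting depth of <a>
        self._skip = 0         # nesting depth of <script>/<style>

    def flush(self):
        """Close the current block and start a new one."""
        text = "".join(self._text).strip()
        if text:
            self.blocks.append((text, self._link_chars))
        self._text, self._link_chars = [], 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif tag == "a":
            self._in_link += 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip:
            self._skip -= 1
        elif tag == "a" and self._in_link:
            self._in_link -= 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if self._skip:
            return
        self._text.append(data)
        if self._in_link:
            self._link_chars += len(data.strip())


def extract_main_text(html, max_link_density=0.3, min_chars=20):
    """Keep blocks that are long enough and not dominated by link text."""
    parser = LinkDensityParser()
    parser.feed(html)
    parser.close()   # deliver any buffered text before the final flush
    parser.flush()
    kept = [
        text
        for text, link_chars in parser.blocks
        if len(text) >= min_chars and link_chars / len(text) <= max_link_density
    ]
    return "\n".join(kept)


sample = """
<html><body>
<div id="nav"><a href="/">Home</a> <a href="/news">News</a> <a href="/about">About us</a></div>
<div id="content"><p>This paragraph stands in for the article body. It is long and
contains only one <a href="/ref">link</a>, so its link density is low.</p></div>
<div id="footer"><a href="/contact">Contact</a> <a href="/sitemap">Site map</a></div>
</body></html>
"""

if __name__ == "__main__":
    print(extract_main_text(sample))
```

On the sample page, the navigation and footer blocks consist almost entirely of link text and are filtered out, while the content paragraph is kept. The thresholds would need tuning on real pages, and Chinese text may call for a different `min_chars` since each character carries more information than a Latin letter.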