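Information Extraction Based on Character Extraction and HMM

Citation:	JI Xiang, LIU Hua-xiao, WU Fen-fen, LIU Lei. Information Extraction Based on Character Extraction and HMM[J]. Journal of Jilin University (Information Science Edition), 2009, 27(4): 396-399.

Authors:	JI Xiang, LIU Hua-xiao, WU Fen-fen, LIU Lei

Affiliation:	College of Computer Science and Technology, Jilin University, Changchun 130012, China

Funding:	Supported by the Doctoral Research Fund of Higher Education of China (20060183044)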

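Abstract:	To address the problem that both recall and precision are low in information extraction, an improved HMM (Hidden Markov Model) is proposed that uses a new text-chunking technique. Using the semantic and structural features of the text, the states that have distinguishing features are extracted first; on that basis, the improved HMM extracts the remaining states that have no such features. The method was tested on 100 computer science paper headers provided by the data search engine research group at Carnegie Mellon University. The results show that, compared with word-based and traditional HMM methods, recall and precision reached 91.99% and 94.79%, respectively.
Keywords:	text block; feature extraction; machine learning; HMM model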
Information Extraction Based on Character Extraction and HMM

JI Xiang,LIU Hua-xiao,WU Fen-fen,LIU Lei.Information Extraction Based on Character Extraction and HMM[J].Journal of Jilin University:Information Sci Ed,2009,27(4):396-399.

Authors:	JI Xiang LIU Hua-xiao WU Fen-fen LIU Lei

Institution:	College of Computer Science and Technology,Jilin University,Changchun 130012,China

Abstract:	An improved HMM (Hidden Markov Model) is proposed for text information extraction. It uses the semantic and structural characteristics of the text to determine the states that have distinguishing characteristics, and then extracts the remaining states, which have no such characteristics, with the improved HMM. This addresses the problem that both the recall rate and the precision rate are low in information extraction. We tested 100 headers of computer science papers from the data provided by the search-e...

Keywords:	text block; character extraction; machine learning; hidden Markov models (HMM)
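
The two-stage idea in the abstract (fix the states that surface features identify, then let the HMM label the rest) can be illustrated with a small constrained Viterbi decoder. This is only a sketch under stated assumptions, not the authors' algorithm: the label set, the probabilities, the length-based emission score, the `@` rule, and the names `feature_label` and `decode` are all hypothetical and chosen for illustration.

```python
import math
import re

# Hypothetical label set and toy parameters; none of these come from the paper.
STATES = ["title", "author", "email"]
START = {"title": 0.8, "author": 0.15, "email": 0.05}
TRANS = {
    "title":  {"title": 0.5,  "author": 0.45, "email": 0.05},
    "author": {"title": 0.05, "author": 0.5,  "email": 0.45},
    "email":  {"title": 0.05, "author": 0.25, "email": 0.7},
}


def emission(state, block):
    """Toy emission probability guessed from block length (an assumption)."""
    n = len(block.split())
    if state == "title":
        return 0.6 if n >= 4 else 0.2
    if state == "author":
        return 0.6 if n <= 3 else 0.2
    return 0.1  # email blocks are expected to be caught by the feature rule


def feature_label(block):
    """Stage 1: return a label when a surface feature pins the state, else None."""
    if re.search(r"\S+@\S+", block):
        return "email"
    return None


def decode(blocks):
    """Stage 2: Viterbi over text blocks, with feature-labelled blocks clamped."""
    if not blocks:
        return []
    fixed = [feature_label(b) for b in blocks]

    def allowed(i):
        # A clamped block admits only its feature-derived state.
        return [fixed[i]] if fixed[i] else STATES

    def emit(s, i):
        # A clamped block contributes probability 1 for its fixed state.
        return 1.0 if fixed[i] else emission(s, blocks[i])

    # score[s] = best log-probability of any path ending in state s
    score = {s: math.log(START[s]) + math.log(emit(s, 0)) for s in allowed(0)}
    back = []
    for i in range(1, len(blocks)):
        new_score, pointers = {}, {}
        for s in allowed(i):
            prev = max(score, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (score[prev] + math.log(TRANS[prev][s])
                            + math.log(emit(s, i)))
            pointers[s] = prev
        back.append(pointers)
        score = new_score

    # Follow the back-pointers from the best final state.
    path = [max(score, key=score.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Made-up header blocks for illustration only.
blocks = ["A Toy Paper Title About HMMs", "A. Author", "someone@example.org"]
labels = decode(blocks)
print(labels)  # ['title', 'author', 'email']
```

Clamping the feature-identified blocks narrows the search at those positions to a single state, so the transition model only has to resolve the blocks that carry no distinguishing feature.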
This paper is indexed in CNKI, VIP, Wanfang Data, and other databases.