A Fast Hash Algorithm for Chinese Word Segmentation

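Cite this article:	LI Xiang-yang, ZHANG Ya-fei. A Fast Hash Algorithm for Chinese Word Segmentation[J]. Journal of PLA University of Science and Technology (Natural Science Edition), 2004, 5(2): 40-44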

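Authors:	LI Xiang-yang, ZHANG Ya-fei

Affiliations:	Institute of Communications Engineering, PLA University of Science and Technology, Nanjing, Jiangsu 210007; Training Department, PLA University of Science and Technology, Nanjing, Jiangsu 210007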

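Abstract:	Chinese-language processing systems such as word-based search engines place high demands on segmentation speed. An efficient data structure for an electronic Chinese word list is designed; it supports hash lookup both by first character and by whole word. A fast hash-based word segmentation algorithm is proposed. Theoretical analysis shows that its average number of matching operations is below 1.08, better than current algorithms of the same kind.
Keywords:	automatic word segmentation; data structure; Hash
Article ID:	1009-3443(2004)02-0040-05
Revised:	2003-05-27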
Fast Hash Algorithm for Chinese Word Segmentation

LI Xiang-yang and ZHANG Ya-fei. Fast Hash Algorithm for Chinese Word Segmentation[J]. Journal of PLA University of Science and Technology(Natural Science Edition), 2004, 5(2): 40-44

Authors:	LI Xiang-yang and ZHANG Ya-fei

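Affiliation:	LI Xiang-yang (1), ZHANG Ya-fei (2). 1: Institute of Communications Engineering, PLA University of Science and Technology, Nanjing, Jiangsu 210007; 2: Training Department, PLA University of Science and Technology, Nanjing, Jiangsu 210007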

Abstract:	Segmentation speed is critical for many Chinese NLP systems, such as word-based web search engines. This paper designs an efficient data structure for an electronic Chinese word list that supports hash lookup both by the first character of a string and by the whole string. A fast hash-based algorithm for Chinese word segmentation is proposed. Theoretical analysis shows that its average number of matching operations is below 1.08, lower than that of other comparable segmentation algorithms.

Keywords:	automatic segmentation; data structure; Hash
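
The abstract mentions a word list that can be hashed by first character and by whole word, but it does not give the layout or the matching procedure. The Python sketch below is only one plausible reading, not the authors' design: it assumes a two-level index (first character, then word length, then a hash set of words) and pairs it with ordinary forward maximum matching. The names `HashLexicon`, `longest_match`, and `segment`, and the way probes are counted, are all hypothetical.

```python
from collections import defaultdict


class HashLexicon:
    """Hypothetical word list: first character -> word length -> set of words."""

    def __init__(self, words):
        self.index = defaultdict(lambda: defaultdict(set))
        for word in words:
            if word:
                self.index[word[0]][len(word)].add(word)

    def longest_match(self, text, start):
        """Return (token, probes) for the longest lexicon word at text[start:].

        `probes` counts whole-word hash lookups only. If no word matches,
        the single character at `start` is returned as the token.
        """
        probes = 0
        by_length = self.index.get(text[start])  # hash lookup by first character
        if by_length is None:
            return text[start], probes
        remaining = len(text) - start
        # Try only the lengths that actually occur for this first character,
        # longest first, so impossible candidates are never probed.
        for length in sorted(by_length, reverse=True):
            if length > remaining:
                continue
            candidate = text[start:start + length]
            probes += 1
            if candidate in by_length[length]:  # hash lookup by whole word
                return candidate, probes
        return text[start], probes


def segment(text, lexicon):
    """Forward maximum matching; returns (tokens, total whole-word probes)."""
    tokens, probes, i = [], 0, 0
    while i < len(text):
        token, n = lexicon.longest_match(text, i)
        tokens.append(token)
        probes += n
        i += len(token)
    return tokens, probes


lexicon = HashLexicon(["中文", "分词", "算法", "高速", "中文分词"])
tokens, probes = segment("中文分词算法", lexicon)
print(tokens, probes)  # ['中文分词', '算法'] 2
```

In this toy run each token is found on its first whole-word probe. That is the kind of behaviour a low average match count would describe, although the 1.08 figure in the abstract comes from the paper's own analysis and is not reproduced by this sketch.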
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.