
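A New Method for Character Feature Extraction and Its Two Applications in Recognition
Cite this article: Li Zuo, Wang Shuhua, Cai Shijie. A New Method for Character Feature Extraction and Its Two Applications in Recognition[J]. Journal of Nanjing University (Natural Science Edition), 2002, 38(1): 90-98.
Authors: Li Zuo  Wang Shuhua  Cai Shijie
Affiliation: State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093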
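Abstract: Feature extraction is an important step in recognition, and extracting highly descriptive features can effectively improve the recognition efficiency of a classifier. After introducing the relevant concepts, this paper presents two important applications of feature line extraction in the character recognition process. First, in classification, feature lines can serve as the feature vectors for matching, so that isolated characters are recognized through bidirectional matching. Second, when recognizing merged characters, feature lines can be used to predict the leading character; the prediction is verified after the leading character has been extracted, so that merged characters are segmented and recognized accurately. The method for interactively determining feature lines is also described in detail. Finally, the practical value of character feature lines is evaluated on the basis of experimental data.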

Keywords: OCR  feature extraction  character feature line  character segmentation  character shielding code  feature recognition  classifier

A New Method for Character Feature Extraction and Its Two Applications in Recognition
Li Zuo, Wang Shuhua, Cai Shijie. A New Method for Character Feature Extraction and Its Two Applications in Recognition[J]. Journal of Nanjing University: Nat Sci Ed, 2002, 38(1): 90-98.
Authors: Li Zuo  Wang Shuhua  Cai Shijie
Abstract: The performance of character recognition depends heavily on the features used. This paper first defines a character feature called the feature line. It then presents an important use of the feature line in character classification. The classifier extracts the feature lines of characters from the bitmap and tests a necessary-sufficient condition: the necessary condition holds when the feature line lies within the sample's corresponding line, and the sufficient condition holds when the feature line contains the sample's corresponding line. This lets the classifier recognize characters very efficiently. In an experiment using common documents as samples, the method reached a recognition rate of 99.68%. A second important use of the feature line is the segmentation of merged characters, a difficulty that has received a great deal of attention in Optical Character Recognition (OCR). Unsuitable segmentation of merged characters is currently the primary cause of recognition errors. The paper presents an algorithm that segments and recognizes merged characters by predicting the first merged character; its most important step is the necessary-condition match of feature lines. The algorithm is effective, feasible, and robust when cutting complicated merged characters, even in poor-quality images. A method for obtaining the feature lines in the character templates is also described in detail. Finally, the paper evaluates the performance of feature line extraction in these two processes.
Keywords: OCR  feature extraction  character feature lines  character segmentation  character shielding code
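
The necessary and sufficient conditions named in the abstract can be read as two containment tests between a template's feature line and the sample's line at the same row. The Python sketch below illustrates that reading only; it is not the authors' implementation. It assumes each bitmap row is packed into an integer bit mask (1 = ink pixel) and that the sample has already been normalized to the template's size. The names `pack`, `necessary`, `sufficient`, and `classify` are hypothetical.

```python
from typing import Dict, List

Row = int  # one bitmap row packed as a bit mask, 1 = ink pixel (assumed representation)


def pack(line: str) -> Row:
    """Convert a row drawn with '#' (ink) and '.' (background) into a bit mask."""
    return int(line.replace(".", "0").replace("#", "1"), 2)


def necessary(feature: Row, sample: Row) -> bool:
    """Feature line lies within the sample's line: every feature pixel is ink in the sample."""
    return feature & ~sample == 0


def sufficient(feature: Row, sample: Row) -> bool:
    """Feature line contains the sample's line: every sample ink pixel is in the feature."""
    return sample & ~feature == 0


def classify(sample_rows: List[Row], templates: Dict[str, Dict[int, Row]]) -> List[str]:
    """Return the labels whose feature lines pass both tests on every listed row.

    templates maps a label to {row index: feature line}; this layout is an assumption.
    """
    matches = []
    for label, feature_lines in templates.items():
        if all(
            necessary(f, sample_rows[i]) and sufficient(f, sample_rows[i])
            for i, f in feature_lines.items()
        ):
            matches.append(label)
    return matches


# A 5x5 sample shaped like the letter "I".
sample = [pack(r) for r in (".###.", "..#..", "..#..", "..#..", ".###.")]

# Hypothetical templates: two feature lines per character.
templates = {
    "I": {0: pack(".###."), 2: pack("..#..")},
    "L": {0: pack("#...."), 4: pack("#####")},
}

result = classify(sample, templates)
```

Under this reading, passing both tests on the same row means the two rows are identical, so a practical system would more likely apply the necessary test alone as a cheap filter (as the abstract suggests for predicting the first merged character) and reserve the two-way test for final confirmation.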
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.