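Graph Feature Extraction Method in Chinese Character Recognition
Cite this article: Tang Shancheng, Liang Shaojun, Dai Fenghua, Lai Kun, Cao Yaoqian. Graph Feature Extraction Method in Chinese Character Recognition[J]. Science Technology and Engineering, 2024, 24(2): 658-664.
Authors: Tang Shancheng  Liang Shaojun  Dai Fenghua  Lai Kun  Cao Yaoqian
Affiliations: School of Communication and Information Engineering, Xi'an University of Science and Technology; CCCC Second Highway Engineering Co., Ltd.
Funding: National Key Research and Development Program of China (2018YFC0808300); Shaanxi Province Science and Technology Program, Key Industry Innovation Chain (Cluster) Project (2020ZDLGY15-07); Xi'an Science and Technology Program, Science and Technology Innovation Guidance Project (201805036YD14CG20(4))
Abstract: Representing Chinese characters by image pixels does not effectively capture their essential features and has high space complexity. To address this, a graph feature extraction method for Chinese characters is proposed. The method consists of three main parts: binarization of the character image, skeleton extraction from the character image, and graph feature extraction. Binarization removes noise from the image and improves the accuracy of graph feature extraction; skeleton extraction keeps the important pixels in the image and discards irrelevant ones; graph feature extraction combines the key points of a character with a graph data structure to represent the character's shape. Experiments were carried out on 3,908 commonly used Chinese characters in 5 fonts. The results show that the method correctly extracts graph features from characters with complex strokes and effectively represents their essential features; up to 3,195 characters have identical graph features across different fonts, so the method is relatively stable; on average each character can be represented by 22.6 graph nodes and 19.1 edges, which greatly reduces space complexity compared with representing character features as a single-channel image.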

Keywords: Chinese character recognition  graph features  graph data structure
Received: 2023/1/12
Revised: 2023/10/8
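
The three stages named in the abstract (binarization, skeleton extraction, graph feature extraction) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which thresholding, thinning, or key-point rules the paper uses. Here a fixed threshold, Zhang-Suen thinning, and a "degree other than 2" key-point rule are swapped in, and all function names are hypothetical.

```python
from typing import List, Set, Tuple

Grid = List[List[int]]
Point = Tuple[int, int]  # (row, col)

# 8-neighbourhood, clockwise from north (P2..P9 in Zhang-Suen notation).
OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def binarize(gray: Grid, threshold: int = 128) -> Grid:
    """ASSUMPTION: fixed threshold; dark pixels (ink) become 1, the rest 0."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def _ring(img: Grid, r: int, c: int) -> List[int]:
    """Values of the 8 neighbours of (r, c); out-of-bounds counts as 0."""
    h, w = len(img), len(img[0])
    return [img[r + dr][c + dc] if 0 <= r + dr < h and 0 <= c + dc < w else 0
            for dr, dc in OFFSETS]


def skeletonize(img: Grid) -> Grid:
    """ASSUMPTION: Zhang-Suen thinning, repeated until no pixel is removed."""
    img = [row[:] for row in img]
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(len(img)):
                for c in range(len(img[0])):
                    if not img[r][c]:
                        continue
                    p = _ring(img, r, c)  # p[0]=P2 (north) ... p[7]=P9
                    b = sum(p)  # number of ink neighbours
                    # number of 0 -> 1 transitions around the ring
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                    if step == 0:
                        ok = p[0] * p[2] * p[4] == 0 and p[2] * p[4] * p[6] == 0
                    else:
                        ok = p[0] * p[2] * p[6] == 0 and p[0] * p[4] * p[6] == 0
                    if 2 <= b <= 6 and a == 1 and ok:
                        to_clear.append((r, c))
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img


def neighbours(p: Point, pixels: Set[Point]) -> List[Point]:
    """Skeleton neighbours of p; a diagonal link is ignored when an
    orthogonal detour through a shared pixel already connects the two."""
    r, c = p
    out = []
    for dr, dc in OFFSETS:
        q = (r + dr, c + dc)
        if q not in pixels:
            continue
        if dr and dc and ((r + dr, c) in pixels or (r, c + dc) in pixels):
            continue
        out.append(q)
    return out


def extract_graph(skel: Grid) -> Tuple[List[Point], Set[Tuple[Point, Point]]]:
    """ASSUMPTION: key points are skeleton pixels whose degree is not 2
    (stroke ends and junctions); an edge joins two key points linked by a
    chain of degree-2 pixels. Self-loops and parallel edges are dropped."""
    pixels = {(r, c) for r, row in enumerate(skel) for c, v in enumerate(row) if v}
    nodes = sorted(p for p in pixels if len(neighbours(p, pixels)) != 2)
    node_set = set(nodes)
    edges: Set[Tuple[Point, Point]] = set()
    for start in nodes:
        for nxt in neighbours(start, pixels):
            prev, cur = start, nxt
            while cur not in node_set:  # follow the chain to the next key point
                a, b = neighbours(cur, pixels)
                prev, cur = cur, (b if a == prev else a)
            if cur != start:
                edges.add(tuple(sorted((start, cur))))
    return nodes, edges
```

For a grayscale grid `gray`, the stages chain as `extract_graph(skeletonize(binarize(gray)))`. A closed stroke with no end or junction yields no key points under this simplified rule, so a real system would need an extra rule for such loops.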

Graph Feature Extraction Method in Chinese Character Recognition
Tang Shancheng, Liang Shaojun, Dai Fenghua, Lai Kun, Cao Yaoqian. Graph Feature Extraction Method in Chinese Character Recognition[J]. Science Technology and Engineering, 2024, 24(2): 658-664.
Authors: Tang Shancheng  Liang Shaojun  Dai Fenghua  Lai Kun  Cao Yaoqian
Institution: School of Communication and Information Engineering, Xi'an University of Science and Technology
Abstract: Representing Chinese character features by image pixels cannot effectively capture the essential features of the characters and has high space complexity. To solve this problem, a graph feature extraction method for Chinese characters is proposed. The method has three main parts: binarization of the Chinese character image, skeleton extraction from the Chinese character image, and graph feature extraction. Binarization eliminates noise in the image and improves the accuracy of graph feature extraction; skeleton extraction retains the important pixels in the image and eliminates irrelevant pixels; graph feature extraction combines Chinese character key points with a graph data structure to represent the shape features of the character. Experiments were carried out on 3,908 commonly used Chinese characters in five fonts. The results show that the method can correctly extract the graph features of Chinese characters with complex strokes and effectively represent their essential features; up to 3,195 characters have the same graph features across different fonts, so the performance of the method is relatively stable; on average, each Chinese character can be represented by 22.6 graph nodes and 19.1 edges, which greatly reduces space complexity compared with representing Chinese character features as single-channel images.
Keywords: Chinese character recognition  graph features  graph data structure
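
To make the space claim concrete, the short calculation below compares the reported averages with a single-channel image. It is a rough sketch under assumptions the abstract does not state: the image is taken to be 64 x 64 pixels, each node is stored as two coordinates, and each edge as two node indices. The function names are hypothetical.

```python
AVG_NODES = 22.6  # average graph nodes per character, from the abstract
AVG_EDGES = 19.1  # average edges per character, from the abstract
IMAGE_SIDE = 64   # ASSUMPTION: image size is not given in the abstract


def graph_values(nodes: float, edges: float) -> float:
    """Stored numbers for a graph: (x, y) per node plus (i, j) per edge.
    ASSUMPTION: this encoding is illustrative, not the paper's format."""
    return 2 * nodes + 2 * edges


def image_values(side: int) -> int:
    """Stored numbers for a square single-channel image: one per pixel."""
    return side * side


if __name__ == "__main__":
    g = graph_values(AVG_NODES, AVG_EDGES)
    i = image_values(IMAGE_SIDE)
    print(f"graph: {g:.1f} values, image: {i} values, ratio: {i / g:.1f}x")
```

Under these assumptions the graph needs about 83 stored values against 4,096 pixel values; a different image size or encoding would change the ratio.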