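Historical Document Image Binarization Algorithm Based on Contrast Normalization
Cite this article: Feng Yan (冯炎). Historical Document Image Binarization Algorithm Based on Contrast Normalization[J]. 科学技术与工程 (Science Technology and Engineering), 2019, 19(1)
Author: Feng Yan (冯炎)
Affiliation: School of Information Science and Technology, Tibet University, Lhasa 850000
Funding: National Natural Science Foundation of China (61661047); Tibet Autonomous Region Innovation Support Program for Young University Teachers (QCZ2016-02)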
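Abstract: Most historical document images suffer from low contrast caused by background stains, smears, and faint characters, which makes their binarization considerably harder. Earlier work found that the text in a historical document usually differs in intensity from the document background, and that an estimate of the document background can effectively attenuate degraded regions and highlight character information. Building on these two observations, this paper proposes a binarization algorithm for historical document images based on contrast normalization. The method has three steps. First, an image inpainting algorithm is combined with the output of the Niblack algorithm to roughly estimate the background. Next, the estimated background is used to normalize the different types of degradation present in the image. Finally, the normalized image is enhanced and binarized to segment the text. The algorithm was tested on the DIBCO and H-DIBCO datasets and achieved good experimental results.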

Keywords: contrast normalization; document binarization; background estimation
Received: 2018-08-31
Revised: 2018-10-30

Study of Historical Document Image Binarization on Contrast Normalization
Affiliation: School of Engineering, Tibet University, Lhasa, Tibet
Abstract: Historical documents often suffer from degradations, such as faint characters, smears, and large background stains, that make their binarization a challenging task. Motivated by the observations that text within a document usually has a different intensity level from the surrounding background, and that document background estimation can effectively attenuate degraded regions, this paper proposes a new approach to the binarization of historical documents. The proposed method contains three steps. First, an inpainting procedure that uses the Niblack binarization output produces a rough estimate of the background. Then, an image contrast normalization procedure uses this rough background estimate to balance the different types of historical document degradation. Finally, the document text is enhanced and segmented from the normalized image by an existing binarization technique. The proposed approach has been tested on the DIBCO and H-DIBCO datasets of historical document images and outperforms state-of-the-art techniques.
Keywords: contrast normalization; document binarization; background estimation
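
The three steps named in the abstract can be illustrated with a minimal pure-Python sketch on a toy image. It is not the paper's implementation. Several choices are assumptions made only for illustration: the window radius `r=2` and Niblack parameter `k=-0.2`, a simple mean fill standing in for the inpainting algorithm, Otsu's global threshold standing in for the unnamed "existing binarization technique", and the omission of the enhancement step. All function names are hypothetical.

```python
from statistics import mean, pstdev


def window(img, y, x, r):
    """Coordinates of the (2r+1) x (2r+1) window around (y, x), clipped to the image."""
    h, w = len(img), len(img[0])
    return [(j, i)
            for j in range(max(0, y - r), min(h, y + r + 1))
            for i in range(max(0, x - r), min(w, x + r + 1))]


def niblack_mask(img, r=2, k=-0.2):
    """Rough text mask: True where a pixel is darker than local mean + k * local std."""
    mask = []
    for y, row in enumerate(img):
        mask_row = []
        for x, p in enumerate(row):
            vals = [img[j][i] for j, i in window(img, y, x, r)]
            mask_row.append(p < mean(vals) + k * pstdev(vals))
        mask.append(mask_row)
    return mask


def estimate_background(img, mask, r=2):
    """Assumed stand-in for inpainting: fill masked pixels with the mean of nearby known pixels."""
    bg = [list(row) for row in img]
    known = [[not m for m in row] for row in mask]
    todo = [(y, x) for y, row in enumerate(mask) for x, m in enumerate(row) if m]
    while todo:
        filled = []
        for y, x in todo:
            vals = [bg[j][i] for j, i in window(img, y, x, r) if known[j][i]]
            if vals:
                filled.append((y, x, mean(vals)))
        if not filled:  # no known pixel anywhere: keep the original values
            break
        for y, x, v in filled:
            bg[y][x], known[y][x] = v, True
        done = {(y, x) for y, x, _ in filled}
        todo = [p for p in todo if p not in done]
    return bg


def normalize(img, bg):
    """Divide by the background estimate: background maps to about 255, text stays dark."""
    return [[min(255, round(255 * p / max(b, 1))) for p, b in zip(ri, rb)]
            for ri, rb in zip(img, bg)]


def otsu_level(img):
    """Otsu's global threshold on levels 0..255: maximise the between-class variance."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total, sum_all = sum(hist), sum(i * c for i, c in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0
    for t, c in enumerate(hist):
        w0, sum0 = w0 + c, sum0 + t * c
        w1 = total - w0
        if w0 and w1:
            var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
            if var > best_var:
                best_t, best_var = t, var
    return best_t


def binarize(img):
    """Niblack mask -> background estimate -> normalization -> Otsu. True marks text."""
    bg = estimate_background(img, niblack_mask(img))
    norm = normalize(img, bg)
    t = otsu_level(norm)
    return [[v <= t for v in row] for row in norm]


# Toy page: the background brightens from left to right, and two vertical
# strokes have half the brightness of the background underneath them.
H, W = 5, 24
truth = [[1 <= y <= 3 and x in (6, 21) for x in range(W)] for y in range(H)]
page = [[(100 + 6 * x) // (2 if truth[y][x] else 1) for x in range(W)] for y in range(H)]
result = binarize(page)
print("matches ground truth:", result == truth)
```

On this toy page the darkest background pixel (100) is darker than the stroke on the bright side (113), so no single threshold on the raw intensities separates text from background. Dividing by the estimated background maps both strokes to roughly half of full scale, which is what lets one global threshold work afterwards.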
This article is indexed in Wanfang Data (万方数据) and other databases.