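JBIG2 encoding algorithm based on OCR information
Cite this article: SHANG Junqing, LIU Changsong, DING Xiaoqing. JBIG2 encoding algorithm based on OCR information [J]. Journal of Tsinghua University (Science and Technology), 2006, 46(7): 1247-1249.
Authors: SHANG Junqing (尚俊卿), LIU Changsong (刘长松), DING Xiaoqing (丁晓青)
Affiliation: State Key Laboratory of Intelligent Technology and Systems, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
Abstract: Bi-level image coding is widely used in text storage and image retrieval. To raise the compression ratio of bi-level images, this paper proposes a JBIG2 (Joint Bi-level Image Experts Group) encoding algorithm that exploits OCR results. When compressing a bi-level text image by pattern matching, the algorithm uses the OCR recognition results and their confidence values to improve both prototype (character template) reconstruction and pattern matching, which improves the performance of JBIG2. Every character in the image whose recognition result is credible is replaced by a reconstructed prototype, so the encoder only needs to encode the character's position. Experimental results show that the algorithm outperforms earlier JBIG2 algorithms: it yields higher image quality than earlier lossy compression algorithms and, on the test images, a compression ratio 14.3% higher than earlier lossless compression algorithms.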

Keywords: pattern recognition; bi-level image coding; text image compression; pattern matching
Article ID: 1000-0054(2006)07-1247-03
Revised manuscript received: May 12, 2005

Lossy JBIG2 based on optical character recognition
SHANG Junqing, LIU Changsong, DING Xiaoqing. Lossy JBIG2 based on optical character recognition [J]. Journal of Tsinghua University (Science and Technology), 2006, 46(7): 1247-1249.
Authors: SHANG Junqing, LIU Changsong, DING Xiaoqing
Abstract: Bi-level image coding is useful for document storage and archiving, for image search on the Internet, and for digital libraries. The JBIG2 (Joint Bi-level Image Experts Group) standard for lossless and lossy coding of bi-level images is a very flexible encoding strategy that allows researchers to design their own encoders. OCR processing of text images provides recognition results together with confidence measures. We propose a lossy JBIG2 encoding method that uses these OCR results to improve pattern-matching-based text image compression. All credibly recognized characters in the image are replaced by representative character images, so the encoder only needs to mark the positions of these characters. Experimental results show that this method outperforms previous JBIG2 encoding methods, requiring 14.3% less storage than previous lossless methods while preserving relatively good text image quality.
Keywords: OCR
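
The substitution step summarized in the abstract (credibly recognized characters are replaced by one representative image, and only their positions are coded) can be illustrated with a minimal Python sketch. This is not the authors' encoder: the glyph record layout, the `threshold` value of 0.9, the function names, and the pixel-wise majority vote used to build the representative image are all assumptions made for illustration, and the record above does not say how the paper reconstructs its prototypes.

```python
from collections import defaultdict

# A bitmap is a tuple of rows, each row a tuple of 0/1 pixels.
# Assumption: glyphs sharing an OCR label were size-normalized to one shape.

def majority_prototype(bitmaps):
    """Pixel-wise majority vote over equally sized bitmaps (ties become 0)."""
    height, width = len(bitmaps[0]), len(bitmaps[0][0])
    half = len(bitmaps) / 2
    return tuple(
        tuple(1 if sum(b[r][c] for b in bitmaps) > half else 0
              for c in range(width))
        for r in range(height)
    )

def encode_page(glyphs, threshold=0.9):
    """Split glyphs into trusted and untrusted sets by OCR confidence.

    glyphs: list of dicts with keys 'bitmap', 'label', 'conf', 'pos'
            (hypothetical layout for this sketch).
    Returns (dictionary, placements, residual):
      dictionary - label -> prototype bitmap, stored once per label
      placements - (label, pos) for every trusted glyph
      residual   - (bitmap, pos) for glyphs whose OCR result is not trusted
    """
    trusted = defaultdict(list)
    placements, residual = [], []
    for g in glyphs:
        if g["conf"] >= threshold:
            trusted[g["label"]].append(g["bitmap"])
            placements.append((g["label"], g["pos"]))
        else:
            # Low-confidence glyphs keep their own bitmap.
            residual.append((g["bitmap"], g["pos"]))
    dictionary = {label: majority_prototype(bms)
                  for label, bms in trusted.items()}
    return dictionary, placements, residual
```

In this sketch the saving comes from storing each trusted prototype once and then only a label and a position per occurrence. A real JBIG2 encoder would additionally entropy-code the symbol dictionary and the placements, which is omitted here.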
This article is indexed in CNKI, Wanfang Data, and other databases.