
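Natural scene text detection algorithm based on improved Faster R-CNN (基于改进Faster R-CNN的自然场景文字检测算法)
Cite as: YANG Hongzhi, PANG Yu, WANG Huiqian. Natural scene text detection algorithm based on improved Faster R-CNN [J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2019, 31(6): 876-884
Authors: YANG Hongzhi (杨宏志), PANG Yu (庞宇), WANG Huiqian (王慧倩)
Affiliation: Photoelectronic Information Sensing and Transmission Technology Laboratory, Chongqing University of Posts and Telecommunications, Chongqing 400065, China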
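Funding: National Natural Science Foundation of China (61301124, 61471075, 61671091); Natural Science Foundation of the Chongqing Science and Technology Commission (cstc2016jcyjA0347); Chongqing University Innovation Team Building Program (Smart Medical Systems and Core Technologies)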
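Abstract: Text in natural scenes is affected by illumination, stains, and small character size, which makes it hard to detect, and traditional detection methods perform poorly. Building on the object detection method Faster R-CNN, this paper proposes an improved method for natural scene text. The improved model consists of three parts: a convolutional neural network (CNN) feature extraction module, a nested LSTM (nested long short-term memory, NLSTM) module, and a region proposal network (RPN) module. The main improvements are as follows. The CNN feature extraction module adds spatial feature fusion across different convolutional layers, so it can extract multi-level features. The added nested LSTM module learns the sequence features of long text sequences, which helps detect text sequences of variable length. The RPN module uses anchors that are 8 pixels wide and of variable height to extract a series of candidate proposal boxes, which works well for small text. In the experiments, comparisons on standard datasets (ICDAR 2013, Multilingual) show that the proposed algorithm clearly outperforms the original algorithm in both accuracy and efficiency. Tests on real examples show that the improved model also detects small text better.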
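The abstract does not say which convolutional layers are fused or how they are combined. As an illustration only, the sketch below shows one common fusion pattern: a deeper, lower-resolution feature map is upsampled by nearest-neighbour repetition and concatenated channel-wise with a shallower one. The function name `fuse_features`, the array shapes, and the choice of concatenation are assumptions, not details taken from the paper.

```python
import numpy as np


def fuse_features(shallow, deep):
    """Fuse two feature maps from different depths (hypothetical helper).

    shallow: array of shape (C1, H, W) from an earlier conv layer.
    deep:    array of shape (C2, H // s, W // s) from a later conv layer,
             where s is an integer downsampling factor.
    Returns an array of shape (C1 + C2, H, W).
    """
    _, h, w = shallow.shape
    _, dh, dw = deep.shape
    if h % dh or w % dw or h // dh != w // dw:
        raise ValueError("deep map must be an integer downsampling of shallow")
    s = h // dh
    # Nearest-neighbour upsampling: repeat each value s times along H and W.
    upsampled = deep.repeat(s, axis=1).repeat(s, axis=2)
    # Channel-wise concatenation keeps both fine and coarse features.
    return np.concatenate([shallow, upsampled], axis=0)


if __name__ == "__main__":
    shallow = np.zeros((2, 4, 4))
    deep = np.arange(4).reshape(1, 2, 2)
    print(fuse_features(shallow, deep).shape)
```

Concatenation is only one option; element-wise addition after a 1x1 convolution is another frequently used choice, and the paper may do either or something else.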

Keywords: region proposal network; spatial features; long-sequence text; proposal boxes; accuracy
Received: 2018-10-11
Revised: 2019-10-08

Natural scene text detection algorithm based on improved Faster R-CNN
YANG Hongzhi, PANG Yu and WANG Huiqian. Natural scene text detection algorithm based on improved Faster R-CNN[J]. Journal of Chongqing University of Posts and Telecommunications, 2019, 31(6): 876-884
Authors:YANG Hongzhi  PANG Yu  WANG Huiqian
Affiliation:Photoelectronic Information Sensing and Transmission Technology Laboratory, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China (all authors)
Abstract:Text in natural scenes is degraded by uneven illumination, smudges, and small character size, so it is difficult to detect, and traditional detection methods perform poorly. Building on the object detection method Faster R-CNN, this paper proposes an improved method for natural scene text. The improved model consists of a convolutional neural network (CNN) feature extraction module, a nested long short-term memory (NLSTM) module, and a region proposal network (RPN) module. There are three main improvements: the CNN feature extraction module fuses spatial features from different convolutional layers and can therefore extract multi-level features; the nested LSTM module learns the sequence features of long text sequences, which helps detect text of variable length; and the RPN module uses anchors with a width of 8 pixels and a variable height to extract a series of candidate proposal boxes, which works well for small text. Experiments on standard datasets (ICDAR 2013, Multilingual) show that the improved algorithm outperforms the original algorithm in both accuracy and efficiency. Tests on real examples show that the improved model also detects small text better.
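The anchor design described in the abstract (fixed 8-pixel width, variable height) can be sketched as follows. Only the 8-pixel width comes from the abstract; the function name `make_text_anchors`, the feature-map stride of 8, and the particular list of heights are illustrative assumptions.

```python
import numpy as np


def make_text_anchors(feat_h, feat_w, stride=8, width=8,
                      heights=(8, 11, 16, 23, 32, 45, 64, 90, 128, 181)):
    """Generate fixed-width, variable-height anchors (hypothetical helper).

    One anchor per height is centred on every feature-map cell.
    `stride` and `heights` are assumed values, not taken from the paper.
    Returns an array of shape (feat_h * feat_w * len(heights), 4)
    holding (x1, y1, x2, y2) in image pixel coordinates.
    """
    half_h = np.asarray(heights, dtype=np.float32) / 2.0  # shape (k,)
    half_w = width / 2.0
    # Centre of each feature-map cell, mapped back to image coordinates.
    cx = (np.arange(feat_w, dtype=np.float32) + 0.5) * stride
    cy = (np.arange(feat_h, dtype=np.float32) + 0.5) * stride
    cy, cx = np.meshgrid(cy, cx, indexing="ij")  # each (feat_h, feat_w)
    cx = cx[..., None]  # (feat_h, feat_w, 1) so heights broadcast on last axis
    cy = cy[..., None]
    x1, y1, x2, y2 = np.broadcast_arrays(
        cx - half_w, cy - half_h, cx + half_w, cy + half_h
    )
    return np.stack([x1, y1, x2, y2], axis=-1).reshape(-1, 4)


if __name__ == "__main__":
    anchors = make_text_anchors(2, 3, heights=(8, 16))
    print(anchors.shape)
    print(anchors[:2])
```

Because every anchor has the same narrow width, a text line of any length is covered by a row of adjacent anchors, and only the vertical extent has to be chosen per anchor. This is what lets a sequence module link neighbouring proposals into a full text line.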
Keywords:region proposal network   spatial features   long sequence text   suggestion boxes   accuracy
This article is indexed in Wanfang Data (万方数据) and other databases.