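Scene text detection combining feature fusion and pyramid attention
Cite this article: FENG Yujing, JIA Shijie. Scene text detection combining feature fusion and pyramid attention [J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2022, 34(1): 110-116. DOI: 10.3979/j.issn.1673-825X.202002160046
Authors: FENG Yujing  JIA Shijie
Affiliation: College of Electrical Information Engineering, Dalian Jiaotong University, Dalian 116028, Liaoning
Funding: Scientific Research Project of the Education Department of Liaoning Province (JDL2019006)
Abstract: Deep-learning-based scene text detection generally lacks feature-level refinement, so otherwise well-designed models cannot be fully exploited. This paper proposes applying feature fusion and a feature pyramid attention module to scene text detection. The four feature maps produced by the base feature extraction network (the PixelLink algorithm) are fused by weighted summation after resampling, and the result is passed to the feature pyramid attention module. Feature fusion combines the feature information of all levels, thereby increasing...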

Keywords: feature fusion  feature pyramid attention module  natural scene text detection  PixelLink  ICDAR2015  ICDAR2013
Received: 2020-02-16
Revised: 2021-11-05

Natural scene text detection based on pyramid attention network and feature fusion
FENG Yujing,JIA Shijie. Natural scene text detection based on pyramid attention network and feature fusion[J]. Journal of Chongqing University of Posts and Telecommunications, 2022, 34(1): 110-116. DOI: 10.3979/j.issn.1673-825X.202002160046
Authors: FENG Yujing  JIA Shijie
Affiliation: College of Electrical Information Engineering, Dalian Jiaotong University, Dalian 116028, P. R. China
Abstract: Deep-learning-based text detection in natural scenes generally lacks feature-level refinement, so well-designed models cannot be fully utilized. To address this problem, feature fusion combined with a feature pyramid attention module is proposed for natural scene text detection. The four feature maps obtained from the base feature extraction network (the PixelLink algorithm) are resampled, fused by weighted summation, and sent to the feature pyramid attention module. The feature fusion module combines feature information from each level to increase the amount of information in the feature maps. The attention network expands the receptive field without requiring more computation, and the spatial pyramid structure uses different grid scales or dilation rates to fuse multi-scale feature information. The feature pyramid attention module has three branches: the refined pyramid network, the nonlinear transformation, and global average pooling. Compared with the PixelLink algorithm, the proposed algorithm improves the F-measure by 2.91% and 4.04% on ICDAR2015 and ICDAR2013, respectively.
Keywords: feature fusion  feature pyramid attention module  natural scene text detection  PixelLink  ICDAR2015  ICDAR2013
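
The fusion step described in the abstract (resample four feature maps, then take a weighted sum) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the interpolation method, the weight values, or the target resolution, so nearest-neighbour upsampling to the size of the largest map and the weights below are assumptions made purely for illustration. The names `upsample_nearest` and `fuse` are hypothetical, and the maps are single-channel nested lists rather than the tensors a real network would use.

```python
from typing import List, Sequence

FeatureMap = List[List[float]]  # one channel, rows x cols


def upsample_nearest(fmap: FeatureMap, out_h: int, out_w: int) -> FeatureMap:
    """Resize a 2-D map to (out_h, out_w) by nearest-neighbour sampling."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def fuse(maps: Sequence[FeatureMap], weights: Sequence[float]) -> FeatureMap:
    """Resample every map to the size of the first map, then return
    the element-wise weighted sum of the resampled maps."""
    if len(maps) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    out_h, out_w = len(maps[0]), len(maps[0][0])
    resized = [upsample_nearest(m, out_h, out_w) for m in maps]
    return [
        [sum(w * m[r][c] for w, m in zip(weights, resized)) for c in range(out_w)]
        for r in range(out_h)
    ]


# Four toy single-channel maps of side 8, 4, 2 and 1, filled with 1.0 .. 4.0.
maps = [[[float(k)] * s for _ in range(s)]
        for k, s in enumerate((8, 4, 2, 1), start=1)]
weights = [0.4, 0.3, 0.2, 0.1]  # hypothetical values; the abstract gives none
fused = fuse(maps, weights)     # 8 x 8 map passed on to the attention module
```

In this toy setup the fused map has the resolution of the largest input, and each cell mixes information from all four levels, which is the effect the abstract attributes to the fusion module. How the paper chooses or learns the weights is not stated in the abstract.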
This article is indexed in Wanfang Data and other databases.