
Scene text detection combining feature fusion and pyramid attention

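Cite this article:	FENG Yujing, JIA Shijie. Scene text detection combining feature fusion and pyramid attention[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2022, 34(1): 110-116. DOI: 10.3979/j.issn.1673-825X.202002160046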

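Authors:	FENG Yujing, JIA Shijie

Affiliation:	College of Electrical Information Engineering, Dalian Jiaotong University, Dalian, Liaoning 116028

Funding:	Scientific Research Project of the Education Department of Liaoning Province (JDL2019006)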

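Abstract:	Deep-learning-based scene text detection generally lacks feature-level refinement, so well-designed models cannot be used to their full potential. This paper proposes applying feature fusion and a feature pyramid attention module to scene text detection. The four feature map layers obtained from the basic feature extraction network (the PixelLink algorithm) are fused by weighted summation after sampling, and the result is passed to the feature pyramid attention module. Feature fusion combines the feature information of every level, thereby increasing...
Keywords:	feature fusion; feature pyramid attention module; natural scene text detection; PixelLink; ICDAR2015; ICDAR2013
Received:	2020-02-16
Revised:	2021-11-05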
Natural scene text detection based on pyramid attention network and feature fusion

FENG Yujing, JIA Shijie. Natural scene text detection based on pyramid attention network and feature fusion[J]. Journal of Chongqing University of Posts and Telecommunications, 2022, 34(1): 110-116. DOI: 10.3979/j.issn.1673-825X.202002160046

Authors:	FENG Yujing, JIA Shijie

Affiliation:	College of Electrical Information Engineering, Dalian Jiaotong University, Dalian 116028, P. R. China

Abstract:	Deep-learning-based text detection in natural scenes generally lacks feature-level refinement, so well-designed models cannot be fully exploited. To address this, a combination of feature fusion and a feature pyramid attention module is proposed for natural scene text detection. The four feature map layers obtained from the basic feature extraction network (the PixelLink algorithm) are fused by weighted summation after sampling and then sent to the feature pyramid attention module. The feature fusion module combines the feature information of each level to increase the amount of information in the feature map layer. The attention network can expand the receptive field without additional computation, and the spatial pyramid structure uses different grid scales or dilation rates to fuse multi-scale feature information. The feature pyramid attention module has three branches: the refined pyramid network, the nonlinear transformation, and global average pooling. Compared with the PixelLink algorithm, the proposed algorithm improves the F-measure by 2.91% and 4.04% on ICDAR2015 and ICDAR2013, respectively.
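
The fusion step described in the abstract (resample the four feature maps to a common size, then add them with weights) can be illustrated with a minimal sketch. This is not the authors' implementation: the single-channel maps, the nearest-neighbour resampling, the map sizes, the weight values, and the function names `upsample_nearest`, `fuse`, and `global_average_pool` are all assumptions made for illustration, since the abstract does not specify them. The last function only mirrors the idea of a global-average-pooling branch.

```python
from typing import List

FeatureMap = List[List[float]]  # one channel, stored as rows x cols


def upsample_nearest(fmap: FeatureMap, out_h: int, out_w: int) -> FeatureMap:
    """Resize a 2-D map to (out_h, out_w) with nearest-neighbour sampling."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def fuse(maps: List[FeatureMap], weights: List[float]) -> FeatureMap:
    """Resample every map to the size of the first one, then take a weighted sum."""
    if len(maps) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    out_h, out_w = len(maps[0]), len(maps[0][0])
    resized = [upsample_nearest(m, out_h, out_w) for m in maps]
    return [
        [sum(w * m[r][c] for w, m in zip(weights, resized)) for c in range(out_w)]
        for r in range(out_h)
    ]


def global_average_pool(fmap: FeatureMap) -> float:
    """Mean of all activations, the operation behind a global-average-pooling branch."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


# Four hypothetical constant maps of sizes 8x8, 4x4, 2x2 and 1x1 (values 1..4).
maps = [
    [[float(level)] * size for _ in range(size)]
    for level, size in zip((1, 2, 3, 4), (8, 4, 2, 1))
]
weights = [0.5, 0.25, 0.125, 0.125]  # assumed values; the abstract gives none
fused = fuse(maps, weights)
print(len(fused), len(fused[0]), global_average_pool(fused))
```

In a real detector the maps would be multi-channel tensors and the resampling would typically be bilinear, but the structure of the computation (resize, then weighted sum) stays the same.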

Keywords:	feature fusion; feature pyramid attention module; natural scene text detection; PixelLink; ICDAR2015; ICDAR2013
This article has been indexed by Wanfang Data and other databases.