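Offline Handwritten Mathematical Expression Recognition Based on Encoder-Decoder
Cite this article:DU Yongtao, YU Yuanhui. Offline Handwritten Mathematical Expression Recognition Based on Encoder-Decoder[J]. Journal of Jimei University (Natural Science), 2022, 0(6): 570-576
Authors:DU Yongtao  YU Yuanhui
Affiliation:(College of Computer Engineering, Jimei University, Xiamen 361021, Fujian, China)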
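Abstract:An improved encoder-decoder model is proposed. The model uses a multi-scale densely connected convolutional network as the encoder to extract multi-resolution features from handwritten mathematical expression images. A Transformer model based entirely on the attention mechanism serves as the decoder, decoding the two-dimensional handwritten mathematical expression into a one-dimensional LaTeX sequence according to the image features. Relative position encoding is used to embed both the image position information and the LaTeX symbol position information. Experimental results show that the model achieves excellent performance on the official CROHME 2014 dataset: compared with current state-of-the-art methods, expression recognition accuracy improves by 3.55% and word error rate falls by 1.41%.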

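Keywords:encoder-decoder  offline handwritten mathematical expression recognition  multi-scale densely connected convolutional network  Transformer model  relative position encoding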
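
The abstract does not say which relative position encoding formulation the model uses. As a rough illustration only, the sketch below assumes the common clipped-offset scheme for a one-dimensional sequence such as the LaTeX symbol sequence: each pair of positions (i, j) is mapped to an index derived from the offset j - i, and that index would select a learned embedding. The function name `relative_position_index` and the parameter `max_distance` are hypothetical and are not taken from the paper.

```python
def relative_position_index(length, max_distance):
    """Build a length x length table of clipped relative offsets.

    Entry [i][j] is the offset (j - i), clipped to
    [-max_distance, max_distance] and shifted by max_distance so that
    it can be used as a non-negative index into an embedding table of
    size 2 * max_distance + 1. This is an assumed, generic scheme, not
    the paper's confirmed formulation.
    """
    table = []
    for i in range(length):
        row = []
        for j in range(length):
            offset = j - i
            clipped = max(-max_distance, min(max_distance, offset))
            row.append(clipped + max_distance)
        table.append(row)
    return table


if __name__ == "__main__":
    # Three LaTeX tokens, offsets clipped to [-1, 1].
    for row in relative_position_index(3, 1):
        print(row)
```

Because the index depends only on the offset between two positions, the same embedding is reused for any pair of symbols the same distance apart, wherever they occur in the sequence. An image feature map would need a two-dimensional analogue with separate row and column offsets.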

Offline Handwritten Mathematical Expression Recognition Based on Encoder-Decoder
DU Yongtao,YU Yuanhui. Offline Handwritten Mathematical Expression Recognition Based on Encoder-Decoder[J]. Journal of Jimei University (Natural Science), 2022, 0(6): 570-576
Authors:DU Yongtao  YU Yuanhui
Affiliation:(College of Computer Engineering,Jimei University,Xiamen 361021,China)
Abstract:In recent years, great progress in handwritten mathematical expression recognition has been made using Encoder-Decoder models. However, these models still have two shortcomings: the encoder extracts insufficient image feature information, and the decoder is inefficient at processing long sequences. To address these shortcomings, this paper proposes an improved Encoder-Decoder model. The model uses a multi-scale Densely Connected Convolutional Network as the encoder to extract multi-resolution features from handwritten mathematical expression images. A Transformer model based entirely on attention serves as the decoder, decoding two-dimensional handwritten mathematical expressions into one-dimensional LaTeX sequences according to the image features. Image position information and LaTeX symbol position information are embedded through relative position encoding. The results show that the model achieves excellent performance on the official CROHME 2014 dataset, with a 3.55% improvement in expression recognition accuracy and a 1.41% reduction in word error rate compared with current state-of-the-art methods.
Keywords:Encoder-Decoder  offline handwritten mathematical expression recognition  multi-scale Densely Connected Convolutional Networks  Transformer model  relative position encoding
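
The abstract reports two metrics, expression recognition accuracy and word error rate, without defining them. A minimal sketch of how such metrics are commonly computed over tokenized LaTeX sequences follows. It assumes that accuracy is the fraction of exactly matching sequences and that word error rate is the total token-level edit distance divided by the total number of reference tokens. The function names and the toy data are hypothetical, and the paper's exact evaluation protocol may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists
    (substitutions, insertions and deletions each cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference token
                            curr[j - 1] + 1,      # insert a hypothesis token
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def word_error_rate(refs, hyps):
    """Total edit distance divided by total reference length (assumed definition)."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_tokens = sum(len(r) for r in refs)
    return total_edits / total_tokens


def expression_rate(refs, hyps):
    """Fraction of predictions that match the reference exactly (assumed definition)."""
    return sum(r == h for r, h in zip(refs, hyps)) / len(refs)


if __name__ == "__main__":
    # Toy data, not taken from CROHME 2014.
    refs = [["x", "^", "2"], ["a", "+", "b"]]
    hyps = [["x", "^", "2"], ["a", "-", "b"]]
    print(expression_rate(refs, hyps))
    print(word_error_rate(refs, hyps))
```

In the toy data, one wrong symbol makes the whole second expression count as unrecognized, while the word error rate charges only that single token. This is why the two metrics can move by different amounts.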