Contrast-enhanced relational memory network for medical image report generation
WANG Zhiqiang, ZENG Xianhua. Contrast-enhanced relational memory network for medical image report generation[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2024, 36(3): 503-512
Authors: WANG Zhiqiang  ZENG Xianhua
Affiliation: School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China; Chongqing Key Laboratory of Image Cognition, Chongqing 400065, P. R. China
Funding: National Natural Science Foundation of China (62076044); Chongqing Talents Program (cstc2022ycjh-bgzxm0160)
Abstract: Existing models for automatic medical imaging report generation extract semantic features only from the visual features of the input image, so the generated words are weakly correlated with one another and lack contextual information. To address these problems, a contrast-enhanced relational memory network model is proposed. Contrastive learning improves the model's ability to distinguish between different images, and an attention-enhanced associative memory module is designed that is continuously updated with the word generated at the previous time step, strengthening the correlations among generated words so that the model produces more accurate descriptions of pathological findings. Experimental results on the public IU X-Ray dataset and a private fetal cardiac ultrasound dataset show that the proposed model clearly outperforms several previous models on the CIDEr metric (compared with the classical AoANet model, CIDEr improves by 51.9% on IU X-Ray and by 3.0% on the fetal cardiac ultrasound dataset).
Keywords: medical image report generation  associative memory network  double-layer LSTM  contextual information  contrastive learning
Received: 2023-03-04
Revised: 2024-03-20
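
The abstract describes two mechanisms only at a high level: a contrastive objective that sharpens the model's ability to tell images apart, and an attention-enhanced associative memory that is rewritten at every decoding step from the previously generated word. The Python sketch below illustrates one common way such components are realized; it is not the authors' implementation, and the module name AssociativeMemory, the slot count, the gating scheme, and the InfoNCE-style loss are all assumptions made for illustration.

# Minimal sketch (not the paper's code): a gated associative-memory update driven by the
# embedding of the previously generated word, plus an InfoNCE-style contrastive loss over
# image features. Names, dimensions, and the use of PyTorch are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AssociativeMemory(nn.Module):
    """Memory matrix updated at each decoding step from the previous word's embedding."""

    def __init__(self, num_slots: int = 3, d_model: int = 512, num_heads: int = 8):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(num_slots, d_model) * 0.02)  # initial slots
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, memory: torch.Tensor, prev_word_emb: torch.Tensor) -> torch.Tensor:
        # memory: (B, S, D); prev_word_emb: (B, D)
        w = prev_word_emb.unsqueeze(1)                                  # (B, 1, D)
        # Each slot attends over [memory; previous word] to propose an update.
        kv = torch.cat([memory, w], dim=1)
        candidate, _ = self.attn(memory, kv, kv)                        # (B, S, D)
        # A per-slot gate decides how much of the proposal is written back.
        g = torch.sigmoid(self.gate(torch.cat([memory, w.expand_as(memory)], dim=-1)))
        return g * torch.tanh(candidate) + (1.0 - g) * memory


def info_nce(img_feats: torch.Tensor, pos_feats: torch.Tensor, temperature: float = 0.07):
    """Contrastive loss: pull an image toward its positive view, push it away from others."""
    a = F.normalize(img_feats, dim=-1)                                  # (B, D)
    b = F.normalize(pos_feats, dim=-1)                                  # (B, D)
    logits = a @ b.t() / temperature                                    # (B, B) similarities
    targets = torch.arange(a.size(0), device=a.device)                  # diagonal = positives
    return F.cross_entropy(logits, targets)


if __name__ == "__main__":
    B, S, D = 4, 3, 512
    mem_module = AssociativeMemory(num_slots=S, d_model=D)
    memory = mem_module.memory.unsqueeze(0).expand(B, -1, -1)           # per-batch memory state
    prev_word = torch.randn(B, D)                                       # embedding of word t-1
    memory = mem_module(memory, prev_word)                              # one decoding-step update
    loss = info_nce(torch.randn(B, D), torch.randn(B, D))
    print(memory.shape, float(loss))

In this sketch the gate decides, slot by slot, how much of the attention-proposed update is written into the memory, which is a common way to keep a relational memory stable while it absorbs each newly generated word.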