
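Research on named entity recognition of Chinese electronic medical records based on CNN-CRF
Cite this article: CAO Yiyi, ZHOU Yinghua, SHEN Fahai, LI Zhixing. Research on named entity recognition of Chinese electronic medical records based on CNN-CRF[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2019, 31(6): 869-875.
Authors: CAO Yiyi  ZHOU Yinghua  SHEN Fahai  LI Zhixing
Affiliations: School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065; Chongqing Key Laboratory of Computational Intelligence, Chongqing 400065
Funding: National Natural Science Foundation of China (61502066)
Abstract: With the development of intelligent healthcare technology, traditional methods alone no longer suffice for medical research. To address entity recognition in Chinese electronic medical records, an entity recognition framework is designed that combines a convolutional neural network with a conditional random field (CNN-CRF). To obtain high-quality word vectors, the labeled entities are added to the dictionary used for word segmentation, both labeled and unlabeled texts are used as the corpus, and the word2vec tool performs unsupervised learning on the segmented text. To avoid the overfitting caused by increasing the number of dilated convolution layers, iterative dilated convolution is used to process the input vectors, and dropout randomly discards some connections. A conditional random field then corrects the network's classification results. The method is evaluated in comparative experiments on Chinese electronic medical records, extracting five types of entities: body parts, diseases, symptoms, examinations, and treatments. The experimental results show that the method identifies the entities in medical records effectively, with precision, recall, and F1 of 90.01%, 90.62%, and 90.31%, respectively, and that both accuracy and speed improve over traditional methods.

Keywords: entity recognition; Chinese electronic medical record; convolutional neural network; conditional random field
Received: 2018/6/28
Revised: 2019/9/9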
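
The abstract mentions iterative dilated convolution with dropout but gives no layer details. The following is a minimal sketch of the general idea, not the authors' implementation: the kernel width of 3, the dilation schedule `[1, 2, 4]`, the ReLU activation, the scalar-valued sequence, and all function names are assumptions chosen for illustration. It shows how reapplying one small block of dilated layers, with the same kernels each time, widens the receptive field without adding parameters.

```python
import random


def dilated_conv1d(seq, kernel, dilation):
    """Length-preserving 1D dilated convolution with zero padding and ReLU."""
    half = len(kernel) // 2
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t + (k - half) * dilation  # taps are `dilation` steps apart
            if 0 <= idx < len(seq):          # positions outside the sequence count as zero
                acc += w * seq[idx]
        out.append(max(acc, 0.0))
    return out


def dropout(seq, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if rate <= 0.0:
        return list(seq)
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in seq]


def iterated_dilated_block(seq, kernels, dilations, iterations, rate=0.0, rng=None):
    """Apply the same stack of dilated layers `iterations` times (shared kernels)."""
    rng = rng or random.Random(0)
    h = list(seq)
    for _ in range(iterations):
        for kernel, d in zip(kernels, dilations):
            h = dilated_conv1d(h, kernel, d)
        h = dropout(h, rate, rng)  # applied once per iteration (an assumption)
    return h


def receptive_field(kernel_width, dilations, iterations):
    """Number of input positions that can influence one output position."""
    return 1 + iterations * sum((kernel_width - 1) * d for d in dilations)


if __name__ == "__main__":
    # An impulse in the middle of the sequence shows how far information spreads.
    x = [0.0] * 31
    x[15] = 1.0
    ones = [[1.0, 1.0, 1.0]] * 3  # hypothetical kernels, one per dilation
    y = iterated_dilated_block(x, ones, [1, 2, 4], iterations=1)
    print(sum(1 for v in y if v > 0), receptive_field(3, [1, 2, 4], 1))
```

Under these assumptions, one pass over dilations 1, 2, 4 covers 15 positions, and four passes cover 57 positions while reusing the same three kernels, which is the property that lets such a network see wide context with few parameters.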

Research on named entity recognition of Chinese electronic medical records based on CNN-CRF
CAO Yiyi, ZHOU Yinghua, SHEN Fahai and LI Zhixing. Research on named entity recognition of Chinese electronic medical records based on CNN-CRF[J]. Journal of Chongqing University of Posts and Telecommunications, 2019, 31(6): 869-875.
Authors: CAO Yiyi  ZHOU Yinghua  SHEN Fahai  LI Zhixing
Abstract: Advances in intelligent medical technology mean that traditional methods alone are no longer sufficient for medical research. For entity recognition in Chinese electronic medical records, this paper designs an algorithm framework based on a convolutional neural network combined with a conditional random field (CNN-CRF). To obtain high-quality word vectors, labeled entities are added to the dictionary used for word segmentation, labeled and unlabeled texts together form the corpus, and the word2vec tool performs unsupervised learning on the segmented text. Because adding more dilated convolution layers can cause overfitting, iterative dilated convolution is applied to the input vectors, and dropout randomly discards some connections. Finally, the conditional random field corrects the network's classification results. In comparative experiments on Chinese electronic medical records, the proposed method extracts five types of entities: body parts, diseases, symptoms, examinations, and treatments. The experimental results show that the approach identifies these entities effectively, with precision, recall, and F1 of 90.01%, 90.62%, and 90.31%, respectively. Compared with traditional methods, both accuracy and speed are improved.
Keywords: entity recognition; Chinese electronic medical record; convolutional neural network; conditional random field
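
The abstract says a conditional random field corrects the network's classification results. One common way a linear-chain CRF does this at prediction time is Viterbi decoding over per-token scores plus tag-transition scores. The sketch below illustrates that step only and is not taken from the paper: the BIO-style tag names (`B-SYM`, `I-SYM`, `O`), the scores, and the function names are hypothetical.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions: list of {tag: score} dicts, one per token (e.g. network outputs).
    transitions: {(prev_tag, cur_tag): score}; missing pairs score 0.0.
    """
    best = {t: emissions[0][t] for t in tags}  # best path score ending in each tag
    backpointers = []
    for scores in emissions[1:]:
        new_best, pointer = {}, {}
        for cur in tags:
            # Pick the predecessor that maximizes path score plus transition score.
            prev = max(tags, key=lambda p: best[p] + transitions.get((p, cur), 0.0))
            new_best[cur] = best[prev] + transitions.get((prev, cur), 0.0) + scores[cur]
            pointer[cur] = prev
        best = new_best
        backpointers.append(pointer)
    # Trace back from the best final tag.
    path = [max(tags, key=lambda t: best[t])]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    path.reverse()
    return path


def greedy(emissions, tags):
    """Per-token argmax, ignoring transitions (what the network alone would pick)."""
    return [max(tags, key=lambda t: scores[t]) for scores in emissions]


if __name__ == "__main__":
    tags = ["O", "B-SYM", "I-SYM"]        # hypothetical BIO labels for "symptom"
    transitions = {("O", "I-SYM"): -1e4}  # an inside tag may not follow "O"
    emissions = [
        {"O": 2.0, "B-SYM": 1.2, "I-SYM": 0.0},
        {"O": 0.5, "B-SYM": 0.2, "I-SYM": 1.5},
        {"O": 2.0, "B-SYM": 0.0, "I-SYM": 0.0},
    ]
    print(greedy(emissions, tags))   # ['O', 'I-SYM', 'O'], an invalid sequence
    print(viterbi(emissions, transitions, tags))  # ['B-SYM', 'I-SYM', 'O']
```

In this toy case the per-token argmax yields an `I-SYM` directly after `O`, which is not a valid BIO sequence; the transition penalty steers decoding to a consistent labeling instead. In a trained CRF the transition scores are learned rather than hand-set as they are here.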
This article is indexed in Wanfang Data and other databases.