
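Named Entity Recognition in Threat Intelligence Domain Based on Deep Learning
Cite this article:WANG Ying,WANG Ze-hao,LI Hong,HUANG Wen-jun.Named Entity Recognition in Threat Intelligence Domain Based on Deep Learning[J].Journal of Northeastern University(Natural Science),2023,44(1):33-39.
Authors:WANG Ying  WANG Ze-hao  LI Hong  HUANG Wen-jun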
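Affiliations:(1. Henan International Joint Laboratory of Intelligent Network Theory and Key Technology, Henan University, Kaifeng 475001, Henan; 2. Henan University Software Engineering Intelligent Information Processing Innovation and Talent Introduction Base (a Henan Province higher-education discipline innovation and talent introduction base), Henan University, Kaifeng 475001, Henan; 3. Institute of Intelligent Network Systems, Henan University, Kaifeng 475001, Henan; 4. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100049)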
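Funding:Natural Science Foundation of Henan Province (182300410164); Henan University Graduate Education Innovation and Quality Improvement Program, Talent Program (No.SYL19060120); National Natural Science Foundation of China, Young Scientists Fund (61702503, 61802016); National Natural Science Foundation of China, Key Program (Y810021104).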
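Abstract:To extract key information from threat intelligence of different origins and help government regulators carry out security risk assessments, this paper addresses the recognition difficulties caused by heavy mixing of Chinese and English in threat intelligence text and by obscure specialized vocabulary. Building on the BiGRU-CRF model, it proposes a threat intelligence named entity recognition method that incorporates boundary features and an iterated dilated convolutional neural network (IDCNN). The method uses a manually constructed rule dictionary to transform entities with clear boundaries, such as English words, which reduces the information loss the model tends to incur on longer texts; IDCNN and a bidirectional gated recurrent unit (BiGRU) then extract the local and global features of the text. Experiments on a threat intelligence corpus show that the proposed model outperforms the other models on the relevant evaluation metrics, with an F-score of 87.4%.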

Keywords:threat intelligence  dilated convolution  named entity recognition  information extraction  deep learning
Revised:2021-10-09

Named Entity Recognition in Threat Intelligence Domain Based on Deep Learning
WANG Ying,WANG Ze-hao,LI Hong,HUANG Wen-jun.Named Entity Recognition in Threat Intelligence Domain Based on Deep Learning[J].Journal of Northeastern University(Natural Science),2023,44(1):33-39.
Authors:WANG Ying  WANG Ze-hao  LI Hong  HUANG Wen-jun
Abstract:To extract key information from threat intelligence of different sources and to help government regulatory authorities carry out security risk assessment, this work targets the recognition difficulty caused by the heavy mixing of Chinese and English in threat intelligence texts and by obscure specialized vocabulary. Based on the BiGRU-CRF model, a threat intelligence named entity recognition (NER) method integrating boundary features and an iterated dilated convolutional neural network (IDCNN) is proposed. First, entities with clear boundaries, such as English words, are transformed according to a manually constructed rule dictionary, which reduces the information loss the model tends to incur when processing long texts. Local features and global contextual features are then obtained through the IDCNN and a bidirectional gated recurrent unit (BiGRU), respectively. Experiments on a threat intelligence corpus show that the proposed model outperforms the other models on the relevant evaluation metrics, with an F-score of 87.4%.
Keywords:threat intelligence  dilated convolution  named entity recognition (NER)  information extraction  deep learning  
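
The abstract does not spell out how the rule dictionary transforms clearly bounded entities, so the following is only a minimal sketch of one plausible reading: spans that match a rule (a URL, an IPv4-like address, an English word) are kept as single tokens carrying a boundary tag, while all other text is split character by character. The rule names, the regular expressions, and the `segment` function are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical rule dictionary (an assumption, not the authors' rules):
# each entry maps a boundary tag to a pattern for a span whose
# boundaries are unambiguous in mixed Chinese/English text.
RULES = [
    ("URL", re.compile(r"https?://[^\s\u4e00-\u9fff]+")),
    ("IP", re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")),
    ("EN", re.compile(r"[A-Za-z][A-Za-z0-9_\-]*")),
]


def segment(text):
    """Return (token, boundary_tag) pairs.

    Rule-matched spans become one token each; every other
    non-whitespace character becomes its own token tagged "CHAR".
    """
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        for tag, pattern in RULES:
            match = pattern.match(text, i)  # anchored at position i
            if match:
                tokens.append((match.group(), tag))
                i = match.end()
                break
        else:
            # No rule matched at this position: emit a single character.
            tokens.append((text[i], "CHAR"))
            i += 1
    return tokens


sentence = "攻击者利用Emotet连接192.168.1.10"
pairs = segment(sentence)
# The 25-character sentence shrinks to 9 tokens for the tagger.
print(len(sentence), len(pairs))
```

Under this reading, the shorter token sequence is what a downstream tagger such as BiGRU-CRF would consume, and the boundary tag could be embedded as an extra input feature; whether the paper does either in this way is not stated in the abstract.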