
Neural Network Coupled Model for Conversion and Exploitation of Heterogeneous Lexical Annotations
Cite this article: HUANG Depeng, LI Zhenghua, GONG Chen, ZHANG Min. Neural Network Coupled Model for Conversion and Exploitation of Heterogeneous Lexical Annotations[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2020, 56(1): 97-104.
Authors: HUANG Depeng  LI Zhenghua  GONG Chen  ZHANG Min
Institution: School of Computer Science and Technology, Soochow University, Suzhou 215006
Funding: Supported by the National Natural Science Foundation of China (61525205, 61876116, 61702518) and the Priority Academic Program Development of Jiangsu Higher Education Institutions
Abstract: To enlarge the pool of manually annotated training data and thereby improve model performance, we make full use of existing heterogeneous annotations when learning model parameters. We extend the coupled sequence labeling model proposed by Li et al. (2015) to a BiLSTM-based deep learning framework. The neural coupled model learns its parameters directly from two heterogeneous training datasets and predicts two optimal tag sequences simultaneously at test time. We conduct extensive experiments on two tasks: part-of-speech (POS) tagging, and joint word segmentation and POS (WS&POS) tagging. The results show that the neural coupled approach exploits heterogeneous lexical data more effectively than both the multi-task learning method and the traditional discrete-feature coupled model. It achieves higher performance in both scenarios: annotation conversion, and improving target-side tagging accuracy by exploiting heterogeneous data.
Keywords: coupled model  BiLSTM  deep learning  part-of-speech tagging  word segmentation
Received: 2019-05-22
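
The abstract does not spell out how one model can be trained on two differently annotated corpora and still emit two tag sequences. The following is a minimal sketch of the general idea behind coupled sequence labeling, assuming a bundled tag space in which every label is a pair (tag from scheme A, tag from scheme B). The tag sets, function names, and example sentence are hypothetical illustrations, not taken from the paper, and the BiLSTM scorer itself is omitted.

```python
from itertools import product


def build_bundled_tags(tags_a, tags_b):
    """Pair every tag of scheme A with every tag of scheme B (assumed coupling)."""
    return list(product(tags_a, tags_b))


def allowed_bundles(bundled, gold_tag, side):
    """Indices of bundled tags consistent with a one-sided gold tag.

    A token annotated only under scheme A (or only under B) is compatible
    with every bundle whose matching component equals the gold tag, so the
    training signal is a set of candidate labels rather than a single one.
    """
    idx = 0 if side == "a" else 1
    return [i for i, pair in enumerate(bundled) if pair[idx] == gold_tag]


def split_bundled(sequence):
    """Split a predicted bundled sequence into the two single-scheme sequences."""
    return [pair[0] for pair in sequence], [pair[1] for pair in sequence]


# Hypothetical tag sets for two annotation schemes.
tags_a = ["NN", "VV"]
tags_b = ["n", "v", "vn"]
bundled = build_bundled_tags(tags_a, tags_b)

# A token labeled "NN" under scheme A leaves its scheme-B tag open.
candidates = allowed_bundles(bundled, "NN", side="a")

# One bundled prediction per token yields both tag sequences at once.
seq_a, seq_b = split_bundled([("NN", "n"), ("VV", "v")])
```

In a full model, a scorer would assign a score to each bundled tag at every position, and training would maximize the total probability of all bundled sequences consistent with the one-sided gold annotation. That objective is stated here as an assumption about how such coupled models are typically trained, not as a detail confirmed by this abstract.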