Changsha Dialect Recognition Based on CTC-GRU Model
Citation: LIANG Xiaolin, SHEN Xiangfei, LIANG Zhao, QIU Hailin. Changsha Dialect Recognition Based on CTC-GRU Model[J]. Journal of Jishou University (Natural Science Edition), 2022, 43(2): 45-52.
Authors: LIANG Xiaolin, SHEN Xiangfei, LIANG Zhao, QIU Hailin
Institution: School of Mathematics and Statistics, Changsha University of Science and Technology, Changsha 410114, Hunan, China
Funding: General Program of the National Natural Science Foundation of China (61972055); Key Projects of the Education Department of Hunan Province (17A003, 18A145)
Abstract: To recognize large-vocabulary continuous speech in the Changsha dialect, a gated linear unit neural network model based on the Connectionist Temporal Classification (CTC) algorithm is proposed. First, the feature parameters of the speech are extracted as Mel-frequency cepstral coefficients (MFCC). The extracted features are then fed into the gated linear unit neural network, which is trained and optimized with the CTC algorithm to obtain the predicted label sequence for the entire input sequence. Finally, the CTC model, the GRU model, and the CTC-GRU model are compared on a self-built Changsha dialect corpus, with word error rate (WER) as the evaluation metric. The results show that the CTC-GRU model converges faster and is more accurate than the other two models.
Keywords: CTC-GRU model; MFCC; Changsha dialect recognition; WER
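
The abstract names two steps that are easy to illustrate in isolation: turning a per-frame CTC prediction into a label sequence, and scoring that sequence by word error rate. The Python sketch below is a minimal illustration under stated assumptions, not the authors' implementation. The function names `ctc_greedy_collapse` and `word_error_rate`, the blank index `0`, and the use of greedy (best-path) decoding are all assumptions made for this example; the paper's decoding strategy and tokenization are not given in the abstract.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_greedy_collapse(frame_labels: Sequence[int], blank: int = BLANK) -> List[int]:
    """Collapse a per-frame best-label path into an output label sequence.

    Standard CTC collapsing rule: merge consecutive repeats, then drop blanks.
    """
    collapsed: List[int] = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            collapsed.append(label)
        previous = label
    return collapsed


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """WER = (substitutions + deletions + insertions) / len(reference).

    Computed as the token-level Levenshtein distance divided by the
    number of reference tokens.
    """
    if not reference:
        raise ValueError("reference must contain at least one token")
    # previous_row[j] is the edit distance between the reference prefix
    # processed so far and the first j hypothesis tokens.
    previous_row = list(range(len(hypothesis) + 1))
    for i, ref_token in enumerate(reference, start=1):
        row = [i]
        for j, hyp_token in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_token == hyp_token else 1
            row.append(min(
                previous_row[j] + 1,                      # deletion
                row[j - 1] + 1,                           # insertion
                previous_row[j - 1] + substitution_cost,  # match or substitution
            ))
        previous_row = row
    return previous_row[-1] / len(reference)
```

For example, `ctc_greedy_collapse([0, 1, 1, 0, 1, 2, 2, 0])` yields `[1, 1, 2]`: the blank between the two runs of `1` keeps them as separate labels. For Chinese transcripts the tokens passed to `word_error_rate` could be characters or segmented words; which unit the paper scores is not stated in the abstract.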