
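Single-Channel Speech Separation Based on Non-negative Matrix Factorization and Long Short-Term Memory Networks
Cite this article: Cui Jianfeng, Deng Zeping, Shen Fei, Shi Wenwu. Single-channel speech separation based on non-negative matrix factorization and long short-term memory networks [J]. Science Technology and Engineering, 2019, 19(12).
Authors: Cui Jianfeng, Deng Zeping, Shen Fei, Shi Wenwu
Affiliation (all four authors): Key Laboratory of Electronic Testing Technology, North University of China, Taiyuan 030051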
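Abstract: Algorithms used for speech separation, such as non-negative matrix factorization (NMF) and deep neural networks (DNN), do not take the temporal correlation of speech into account. To address this problem, the NMFLSTM single-channel speech separation algorithm is proposed, combining NMF with a long short-term memory (LSTM) network. The magnitude spectrum of the speech signal serves as the input feature of the model; the basis matrix and coefficient matrix of the target speech are obtained by training the NMF and LSTM models, and the speech is then reconstructed from these results to achieve separation. Experimental results show that, compared with algorithms that ignore the temporal continuity of speech, speech separated with the NMFLSTM algorithm achieves a clearly higher perceptual evaluation of speech quality (PESQ) score, with a maximum above 3.1, indicating good separation performance.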

Keywords: speech separation; magnitude spectrum; non-negative matrix factorization; deep learning; long short-term memory network
Received: 2018-10-17
Revised: 2019-01-29

Single Channel Speech Separation Based on NMF and LSTM
Institution:North University of China
Abstract: At present, non-negative matrix factorization (NMF) and deep neural network (DNN) algorithms used for speech separation do not take the temporal correlation of speech into account. To address this problem, an NMF-LSTM single-channel speech separation algorithm is proposed that combines NMF with a long short-term memory (LSTM) network. The amplitude spectrum of the speech signal is used as the input feature of the model; the basis matrix and coefficient matrix of the target speech are obtained by training the NMF and LSTM models, and these results are then used to reconstruct the speech, which achieves the separation. Experimental results show that, compared with algorithms that do not consider the temporal continuity of speech, the perceptual evaluation of speech quality (PESQ) score of speech separated by the NMF-LSTM algorithm is significantly improved, with a maximum value above 3.1, indicating good separation performance.
Keywords: speech separation; amplitude spectrum; non-negative matrix factorization; deep learning; LSTM
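
The abstract describes factoring the magnitude spectrum into a basis matrix and a coefficient matrix and then reconstructing the target speech. The listing below is a minimal sketch of the NMF half of that idea only, not the authors' implementation. It makes several assumptions that do not come from the article: the function names `nmf` and `separate` are hypothetical, the coefficient matrix is estimated with plain multiplicative updates where the paper uses an LSTM, the cost is the Frobenius norm, the final step is a soft mask, and random non-negative matrices stand in for real magnitude spectrograms.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero in the updates


def nmf(V, rank, n_iter=200, W=None, seed=0):
    """Approximate non-negative V (freq x frames) as W @ H.

    Uses multiplicative updates for the Frobenius-norm cost.
    If W is supplied it is held fixed and only H is estimated.
    """
    rng = np.random.default_rng(seed)
    fixed_W = W is not None
    if not fixed_W:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        if not fixed_W:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def separate(V_mix, W_a, W_b, n_iter=200):
    """Split a mixture spectrogram using two pre-trained basis matrices."""
    W = np.hstack([W_a, W_b])             # stacked bases of both sources
    _, H = nmf(V_mix, W.shape[1], n_iter=n_iter, W=W)
    k = W_a.shape[1]
    est_a = W_a @ H[:k]                   # source A reconstruction
    est_b = W_b @ H[k:]                   # source B reconstruction
    mask_a = est_a / (est_a + est_b + EPS)  # soft mask in [0, 1]
    return mask_a * V_mix, (1.0 - mask_a) * V_mix


# Synthetic stand-ins for the magnitude spectrograms of two sources.
rng = np.random.default_rng(1)
V_a = rng.random((64, 4)) @ rng.random((4, 100))
V_b = rng.random((64, 4)) @ rng.random((4, 100))

W_a, _ = nmf(V_a, rank=4)          # "training": learn bases per source
W_b, _ = nmf(V_b, rank=4, seed=1)
V_mix = V_a + V_b                  # additive mixture (an assumption)
S_a, S_b = separate(V_mix, W_a, W_b)
```

With real audio, `V_mix` would be the magnitude of a short-time Fourier transform, and the separated magnitudes would typically be combined with the mixture phase before inverting the transform; neither step is shown here.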
This article is indexed in CNKI, Wanfang Data, and other databases.