A voice activity detection algorithm based on the spectral skewness

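Cite this article:	LI Qiang, CHEN Dingdang, SHU Qinjun. A voice activity detection algorithm based on the spectral skewness[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2015, 27(6): 728-734.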

Authors:	LI Qiang, CHEN Dingdang, SHU Qinjun

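Affiliation:	Chongqing Key Laboratory of Signal and Information Processing, Chongqing University of Posts and Telecommunications, Chongqing 400065, China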

Funding:	National High Technology Research and Development Program of China ("863" Program) (2012AA01A508)

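Abstract:	Speech signals deviate strongly from a Gaussian distribution, whereas background noise signals deviate only slightly. Based on this property, the paper proposes an improved voice activity detection algorithm that uses the skewness of the short-time magnitude spectrum as the feature for distinguishing speech segments from noise segments, and applies it to a 2.4 kbit/s mixed excitation linear prediction (MELP) vocoder. Compared with the voice activity detection algorithm in the adaptive multi-rate (AMR) speech coding standard, the proposed algorithm has lower complexity and gives better endpoint detection for speech whose background noise follows a Gaussian distribution. The average output bit rate of the variable-rate MELP vocoder drops to 1.9 kbit/s, and the speech synthesized after discontinuous transmission has good comfort and continuity.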
Keywords:	magnitude-spectrum skewness; voice activity detection; variable-rate speech coding; discontinuous transmission
Received:	2014/11/25
Revised:	2015/7/13
A voice activity detection algorithm based on the spectral skewness

LI Qiang, CHEN Dingdang and SHU Qinjun. A voice activity detection algorithm based on the spectral skewness[J]. Journal of Chongqing University of Posts and Telecommunications, 2015, 27(6): 728-734.

Authors:	LI Qiang, CHEN Dingdang and SHU Qinjun

Institution:	Chongqing Key Laboratory of Signal and Information Processing, Chongqing University of Posts and Telecommunications, Chongqing 400065, P.R. China

Abstract:	Speech signals deviate from a Gaussian distribution more strongly than background noise signals do. Based on this characteristic, this paper proposes an improved voice activity detection algorithm that uses the short-time spectral skewness as the feature for distinguishing speech segments from non-speech segments, and applies it to the 2.4 kbit/s mixed excitation linear prediction (MELP) vocoder. Compared with the voice activity detection algorithm in AMR, the proposed algorithm has lower complexity and better detection performance for speech mixed with Gaussian noise. The average coding rate of the variable-rate MELP vocoder is reduced to 1.9 kbit/s, and the speech synthesized under discontinuous transmission is subjectively comfortable and continuous.

Keywords:	spectral skewness; voice activity detection; variable rate speech coding; discontinuous transmission
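
The abstract names the feature (skewness of the short-time magnitude spectrum) but gives neither its formula nor the decision rule. The following is a minimal sketch of the general idea only, not the authors' improved algorithm. It assumes the ordinary sample skewness m3 / m2^1.5 taken over the DFT magnitudes of each frame and a fixed threshold. The function names, the frame length of 64, the threshold of 2.5, and the synthetic test signal (a sinusoid standing in for voiced speech, Gaussian noise standing in for background) are all hypothetical choices made for illustration.

```python
import cmath
import math
import random


def magnitude_spectrum(frame):
    """Return |X[k]| for k = 0..N/2 using a direct O(N^2) DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(frame):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        spectrum.append(abs(acc))
    return spectrum


def skewness(values):
    """Sample skewness m3 / m2**1.5; returns 0.0 when all values are equal."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0.0:
        return 0.0
    return m3 / m2 ** 1.5


def spectral_skewness_vad(signal, frame_len=64, threshold=2.5):
    """Flag each non-overlapping frame as speech-like (True) or noise-like (False).

    The fixed threshold is an assumption; the paper's decision rule is not
    given in the abstract.
    """
    decisions = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        decisions.append(skewness(magnitude_spectrum(frame)) > threshold)
    return decisions


def make_demo_signal(frame_len=64, seed=0):
    """Two frames of Gaussian noise followed by two frames of tone plus noise."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 0.1) for _ in range(2 * frame_len)]
    tone = [math.sin(2 * math.pi * 8 * t / frame_len) + rng.gauss(0.0, 0.1)
            for t in range(2 * frame_len)]
    return noise + tone


if __name__ == "__main__":
    print(spectral_skewness_vad(make_demo_signal()))
```

In this toy setting the Gaussian-noise frames have a fairly flat magnitude spectrum, so their skewness stays low, while the harmonic frames concentrate energy in one bin and produce a much larger skewness. Real speech, real noise, and a practical vocoder would need a tuned or adaptive threshold and some hangover logic, neither of which is attempted here.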
This article is indexed in Wanfang Data and other databases.