An i-vector speaker recognition method based on spectrogram



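Cite this article:	FENG Huizong, WANG Yunfang. An i-vector speaker recognition method based on spectrogram[J]. Journal of Chongqing University (Natural Science Edition), 2017, 40(5): 88-94.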

Authors:	FENG Huizong, WANG Yunfang

Affiliation:	Automotive Electronics Lab, Chongqing University of Posts and Telecommunications, Chongqing 400069, China

Funding:	Supported by the Chongqing Education Achievement Transformation Fund (KJZH14207).

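Abstract:	Using mel-frequency cepstrum coefficients (MFCC) as the features for identity vectors (i-vectors) in speaker recognition leaves the speech information incomplete. To address this, an i-vector speaker recognition method based on spectrogram features is proposed. After pre-emphasis, framing, and windowing, the speech signal is converted into a spectrogram by the short-time Fourier transform. The spectrogram is fed to a Gaussian universal background model, and a suitable low-dimensional linear subspace manifold structure is selected within the high-dimensional mean supervector space to construct normally distributed vectors that serve as i-vectors. The resulting i-vectors are reduced in dimension by linear discriminant analysis and stored. Finally, the log-likelihood ratio (LLR) method scores the i-vectors from the training and testing stages to complete speaker recognition. Numerical experiments on the TIMIT database show that the proposed method achieves a lower equal error rate (EER) than the recognition method that uses MFCC features.
Keywords:	spectrogram; identity vector; speaker recognition
Received:	2016/10/21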
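
The front end described in the abstract (pre-emphasis, framing, windowing, short-time Fourier transform) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `log_spectrogram`, the Hamming window, the 25 ms frame, 10 ms hop, 0.97 pre-emphasis coefficient, 512-point FFT, and the log-magnitude output are all assumptions, since the abstract does not state these settings. The 16 kHz default matches the TIMIT sampling rate.

```python
import numpy as np


def log_spectrogram(signal, sample_rate=16000, frame_ms=25.0, hop_ms=10.0,
                    pre_emphasis=0.97, n_fft=512):
    """Return a (num_frames, n_fft // 2 + 1) log-magnitude spectrogram.

    All default parameter values are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=np.float64)

    # Pre-emphasis: y[n] = x[n] - a * x[n-1], which boosts high frequencies.
    emphasized = np.append(signal[0], signal[1:] - pre_emphasis * signal[:-1])

    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if len(emphasized) < frame_len:
        # Zero-pad short signals so that at least one full frame exists.
        emphasized = np.pad(emphasized, (0, frame_len - len(emphasized)))
    num_frames = 1 + (len(emphasized) - frame_len) // hop_len

    # Framing: build an index matrix of overlapping frames, then apply the window.
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(num_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)

    # Short-time Fourier transform: one zero-padded real FFT per frame.
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)

    # The small constant avoids log(0) on silent frames.
    return np.log(np.abs(spectrum) + 1e-10)
```

Each row of the returned array is one frame. In a pipeline like the one the abstract outlines, these rows would take the place of MFCC vectors as input to the Gaussian universal background model.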
An i-vector speaker recognition method based on spectrogram

FENG Huizong and WANG Yunfang. An i-vector speaker recognition method based on spectrogram[J]. Journal of Chongqing University (Natural Science Edition), 2017, 40(5): 88-94.

Authors:	FENG Huizong and WANG Yunfang

Institution:	Automotive Electronics Lab, Chongqing University of Posts and Telecommunications, Chongqing 400069, P. R. China

Abstract:	An i-vector speaker recognition method based on spectrogram features is proposed to address the incomplete speech information that results when mel-frequency cepstrum coefficients (MFCC) are used as the features for i-vectors. The speech signals are first pre-emphasized, framed, and windowed, and then converted into spectrograms by the short-time Fourier transform. The spectrograms are fed to a Gaussian universal background model, and the i-vectors are constructed in an appropriate low-dimensional linear subspace with manifold structure. These vectors follow a normal distribution and are reduced in dimension by linear discriminant analysis. Finally, the log-likelihood ratio (LLR) method is used to score the i-vectors from the training and testing stages to complete speaker recognition. Numerical experiments on the TIMIT database show that the proposed method achieves a lower equal error rate (EER) than the recognition method that uses MFCC features.

Keywords:	spectrogram; identity vector; speaker recognition
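
The abstract compares methods by equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. A minimal sketch of how an EER could be estimated from trial scores follows. The function name `equal_error_rate` and the simple threshold sweep are assumptions for illustration, not the paper's evaluation code. In the setting described above, the scores would be the LLR scores of target and impostor trials.

```python
import numpy as np


def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    target = np.asarray(target_scores, dtype=np.float64)
    impostor = np.asarray(impostor_scores, dtype=np.float64)
    thresholds = np.sort(np.concatenate([target, impostor]))

    # A trial is accepted when its score is >= the threshold.
    # False acceptance rate: fraction of impostor trials accepted.
    far = np.array([(impostor >= th).mean() for th in thresholds])
    # False rejection rate: fraction of target trials rejected.
    frr = np.array([(target < th).mean() for th in thresholds])

    # Take the threshold where the two error rates are closest and average them.
    best = np.argmin(np.abs(far - frr))
    return (far[best] + frr[best]) / 2.0
```

With a finite set of trials the two error curves rarely cross exactly at a tested threshold, so this sketch returns the average of the two rates at the closest point. Interpolating between neighbouring thresholds would be a reasonable refinement.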
This article is indexed in CNKI, Wanfang Data, and other databases.