Peripheral Nonlinear Time Spectrum Features Algorithm for Large Vocabulary Mandarin Automatic Speech Recognition

Authors:	Fadhil H. T. Al-dulaimy, WANG Zuoying

Affiliation:	Department of Electronic Engineering, Tsinghua University, Beijing 100084, China

Funding:	Supported by the National High-Tech Research and Development (863) Program of China (No. 200/AA/14)

Abstract:	Introduction. Current automatic speech recognition systems are based on context-dependent or context-independent phonics or syllable models described in terms of sequences of hidden Markov model (HMM) states, where each state is assumed to be characterized by a stationary probability density function. The time correlation and, consequently, the signal dynamics inside each HMM state are also usually disregarded, although the use of dynamic features, such as delta and delta-delta parameters, can capture some of the correlations. Consequently, only medium-term dependenc…
Keywords:	nonlinear time spectrum features; computation method; automatic speech recognition system; vocabulary
Received:	1 September 2003

Fadhil H. T. Al-dulaimy, WANG Zuoying. Peripheral Nonlinear Time Spectrum Features Algorithm for Large Vocabulary Mandarin Automatic Speech Recognition[J]. Tsinghua Science and Technology, 2005, 10(2): 174-182.

Abstract:	This work describes an improved feature extraction algorithm that extracts the peripheral features of a point x(t_i, f_j), using a nonlinear algorithm to compute the nonlinear time spectrum (NL-TS) pattern. The algorithm observes the n×n neighborhood of the point in all directions and then incorporates the peripheral features, using the Mel frequency cepstrum components (MFCCs)-based feature extractor of the Tsinghua electronic engineering speech processing (THEESP) for Mandarin automatic speech recognition (MASR) system, as replacements for the dynamic features in different feature combinations. In this algorithm, the orthogonal bases are extracted directly from the speech data by applying the discrete cosine transform (DCT) to 3×3 blocks of the NL-TS pattern, and serve as the peripheral features. The new primal bases are then selected and simplified into the Δ_dp-t operator in the time direction and the Δ_dp-f operator in the frequency direction. The algorithm achieves a 23.29% relative reduction in error rate compared with the standard MFCC feature set and the dynamic features, in tests using THEESP with the duration distribution-based hidden Markov model (DDBHMM)-based MASR system.

Keywords:	large vocabulary speech recognition; Mandarin automatic speech recognition (MASR); duration distribution-based hidden Markov model (DDBHMM); feature identification
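
The abstract mentions applying a DCT to 3×3 blocks of a time-spectrum pattern and simplifying the resulting bases into operators along the time and frequency directions. The sketch below is only an illustration of that general idea, not the authors' algorithm: the nonlinear NL-TS computation and the exact Δ_dp-t and Δ_dp-f operators are not specified in this record, so the input is an arbitrary time × frequency array, and the function names `dct_basis_1d` and `block_dct_features` are hypothetical. It shows that the first-order 3-point DCT basis vector is proportional to [1, 0, -1], i.e. a difference across the neighborhood, which is one plausible reason such bases can stand in for delta-style dynamic features.

```python
import numpy as np


def dct_basis_1d(n=3):
    """Return the orthonormal DCT-II basis of length n (one basis vector per row)."""
    k = np.arange(n)[:, None]  # basis order
    m = np.arange(n)[None, :]  # sample index
    basis = np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    basis[0] *= np.sqrt(1.0 / n)
    basis[1:] *= np.sqrt(2.0 / n)
    return basis


def block_dct_features(pattern, n=3):
    """Project every n x n neighborhood of a (time x frequency) array onto 2-D DCT bases.

    Returns an array of shape (T - n + 1, F - n + 1, n, n), where entry
    [t, f, u, v] uses DCT order u along time and order v along frequency.
    """
    c = dct_basis_1d(n)
    # windows[t, f] is the n x n block whose top-left corner is at (t, f).
    windows = np.lib.stride_tricks.sliding_window_view(pattern, (n, n))
    return np.einsum("ua,tfab,vb->tfuv", c, windows, c)


# Toy pattern (an assumption for illustration): values change only along time.
pattern = np.add.outer(np.arange(6.0), np.zeros(5))
coeffs = block_dct_features(pattern)

# Order (1, 0): difference-like response along time.
delta_t = coeffs[..., 1, 0]
# Order (0, 1): difference-like response along frequency.
delta_f = coeffs[..., 0, 1]
```

For this toy pattern, `delta_t` is constant and nonzero at every position while `delta_f` is zero everywhere, as expected for an input that varies only along the time axis.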