Time-frequency Features for Speech Recognition Based on Sinusoidal Speech Model
Cite as: XING Yan-ling, YANG Ji-bin, ZHANG Xiong-wei. Time-frequency Features for Speech Recognition Based on Sinusoidal Speech Model[J]. Journal of PLA University of Science and Technology (Natural Science Edition), 2004, 5(1): 22-25.
Authors: XING Yan-ling, YANG Ji-bin, ZHANG Xiong-wei
Institution: Institute of Communications Engineering, PLA Univ. of Sci. & Tech., Nanjing 210007, China (all three authors)
Abstract: To improve the performance of automatic speech recognition (ASR) systems, time-frequency distribution parameters are used to describe speech features. Because these parameters take the inherent non-stationarity of the speech signal into account, they describe its time-frequency characteristics more accurately. Several time-frequency features based on the sinusoidal speech model, namely the energy spectrum and the amplitude-weighted short-time average of the instantaneous frequency, are compared, and simulation experiments are carried out on a connected-digit recognition system based on hidden Markov models. The results indicate that time-frequency features used alone as the ASR front end do not improve the recognition rate, whereas a joint front end that combines standard ASR features with energy-spectrum time-frequency features effectively improves recognition performance.
Keywords: automatic speech recognition; front-end features; time-frequency distribution; sinusoidal speech model; energy spectrum
Article ID: 1009-3443(2004)01-0022-04
Revision received: April 18, 2003
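
The abstract does not say how the energy-spectrum features are computed or how the joint front end is assembled. The following is only an illustrative sketch of the general idea, not the authors' method: it assumes a sinusoidal analysis by simple spectral peak picking, a uniform band layout, and plain per-frame concatenation. The function names (`sinusoid_peaks`, `band_energy_features`, `joint_front_end`) and the parameters `n_peaks` and `n_bands` are hypothetical.

```python
import numpy as np


def sinusoid_peaks(frame, sample_rate, n_peaks=10):
    """Return (frequencies_hz, amplitudes) of the strongest spectral peaks.

    Assumption: sinusoidal components are approximated by local maxima
    of the Hann-windowed magnitude spectrum of one frame.
    """
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed))
    # Interior bins larger than both neighbours are peak candidates.
    interior = spectrum[1:-1]
    is_peak = (interior > spectrum[:-2]) & (interior >= spectrum[2:])
    peak_bins = np.flatnonzero(is_peak) + 1
    # Keep the n_peaks candidates with the largest amplitude.
    order = np.argsort(spectrum[peak_bins])[::-1][:n_peaks]
    strongest = peak_bins[order]
    freqs = strongest * sample_rate / len(frame)
    return freqs, spectrum[strongest]


def band_energy_features(frame, sample_rate, n_bands=8, n_peaks=10):
    """Log energy of the picked sinusoids in each of n_bands uniform bands."""
    freqs, amps = sinusoid_peaks(frame, sample_rate, n_peaks)
    edges = np.linspace(0.0, sample_rate / 2.0, n_bands + 1)
    band_index = np.clip(np.digitize(freqs, edges) - 1, 0, n_bands - 1)
    energy = np.zeros(n_bands)
    np.add.at(energy, band_index, amps ** 2)  # accumulate per band
    return np.log(energy + 1e-10)  # small floor avoids log(0)


def joint_front_end(standard_features, tf_features):
    """Concatenate two per-frame feature matrices of shape (n_frames, dim)."""
    if standard_features.shape[0] != tf_features.shape[0]:
        raise ValueError("both feature streams need the same number of frames")
    return np.hstack([standard_features, tf_features])
```

In such a setup the standard stream (for example cepstral coefficients computed elsewhere) and the time-frequency stream would be extracted on the same frame grid, so that each row of the joint matrix describes a single frame.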
This article is indexed in the CNKI, VIP, and Wanfang Data databases, among others.