
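Speaker adaptation algorithm based on linear matrix interpolation
Cite this article:LU Ping, WANG Zuoying, LU Dajin. Speaker adaptation algorithm based on linear matrix interpolation[J]. Journal of Tsinghua University (Science and Technology), 2002, 42(1): 26-29.
Authors:LU Ping  WANG Zuoying  LU Dajin
Affiliation:Department of Electronic Engineering, Tsinghua University, Beijing 100084, China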
Funding:Tsinghua University "985" Major Project (985校-22-攻关-06)
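Abstract:Fast speaker adaptation is a topic of broad interest in speech recognition. Maximum likelihood model interpolation (MLMI) is an effective fast adaptation algorithm, but its main drawback is that it must store a large number of speaker-dependent models. To overcome this drawback, this paper proposes an improved method, the matrix linear interpolation adaptation algorithm. The algorithm replaces the speaker-dependent models used in MLMI with matrices that represent speaker characteristics, and interpolates these matrices linearly. The interpolation coefficients are determined from speech data supplied by the test speaker, according to the maximum likelihood criterion. The interpolated matrix is then applied to the speaker-independent model to obtain the final speaker-adapted model. The algorithm greatly reduces the storage requirement, and its adaptation performance is essentially on par with that of MLMI.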

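Keywords:continuous speech recognition  speaker adaptation  maximum likelihood model interpolation  matrix linear interpolation
Article ID:1000-0054(2002)01-0026-04
Revised manuscript received:February 8, 2001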

Speaker adaptation algorithm based on linear matrix interpolation
LU Ping,WANG Zuoying,LU Dajin.Speaker adaptation algorithm based on linear matrix interpolation[J].Journal of Tsinghua University(Science and Technology),2002,42(1):26-29.
Authors:LU Ping  WANG Zuoying  LU Dajin
Abstract:Fast speaker adaptation techniques for speech recognition are of great interest. Maximum likelihood model interpolation (MLMI) is an effective fast speaker adaptation method, but its main shortcoming is the large amount of memory needed to store speaker-dependent (SD) models. To overcome this memory limitation, this paper proposes a modified method, matrix linear interpolation adaptation. The method uses matrices, rather than the SD models used in MLMI, to represent speaker characteristics. The interpolation coefficients are estimated to maximize the likelihood of the adaptation data. The interpolated matrix is then used to transform the speaker-independent model into the speaker-adapted model. This method greatly reduces the memory requirement while maintaining the adaptation performance of MLMI.
Keywords:continuous speech recognition  speaker adaptation  maximum likelihood model interpolation  matrix linear interpolation
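
The three steps named in the abstract (interpolate the speaker matrices, choose the coefficients by maximum likelihood, transform the speaker-independent model) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not give the model details, so everything specific here is an assumption: the speaker matrices are taken to be square D x D transforms of Gaussian mean vectors, covariances are fixed to the identity, each adaptation frame is hard-aligned to one Gaussian, and the coefficients are left unconstrained. Under those assumptions the maximum likelihood coefficients reduce to a linear least-squares solution. All function and variable names are hypothetical.

```python
import numpy as np


def interpolate_matrices(lambdas, matrices):
    """Return sum_k lambdas[k] * matrices[k]; matrices has shape (K, D, D)."""
    return np.tensordot(lambdas, matrices, axes=1)


def estimate_lambdas(frames, align, si_means, matrices):
    """Estimate interpolation coefficients from adaptation data.

    Assumes unit-covariance Gaussians and a hard frame-to-Gaussian
    alignment (simplifications not stated in the abstract), so that
    maximizing the likelihood is equivalent to least squares.

    frames:   (T, D) adaptation feature vectors
    align:    (T,)   index of the Gaussian assigned to each frame
    si_means: (G, D) speaker-independent mean vectors
    matrices: (K, D, D) stored speaker matrices
    """
    # transformed[k, g] = matrices[k] @ si_means[g], shape (K, G, D)
    transformed = np.einsum("kij,gj->kgi", matrices, si_means)
    # basis[t, k] is the mean that matrix k alone would predict for frame t
    basis = transformed[:, align, :].transpose(1, 0, 2)  # (T, K, D)
    # Normal equations of the least-squares problem in the K coefficients
    A = np.einsum("tkd,tld->kl", basis, basis)
    b = np.einsum("tkd,td->k", basis, frames)
    return np.linalg.solve(A, b)


def adapt_means(lambdas, matrices, si_means):
    """Apply the interpolated matrix to every speaker-independent mean."""
    W = interpolate_matrices(lambdas, matrices)
    return si_means @ W.T  # row g equals W @ si_means[g]


# Toy usage with synthetic data (all values are made up for illustration).
rng = np.random.default_rng(0)
K, D, G, T = 3, 4, 6, 40
speaker_matrices = rng.normal(size=(K, D, D))
si_means = rng.normal(size=(G, D))
align = rng.integers(0, G, size=T)
target = adapt_means(np.array([0.2, 0.5, 0.3]), speaker_matrices, si_means)
frames = target[align] + 0.01 * rng.normal(size=(T, D))

lambdas = estimate_lambdas(frames, align, si_means, speaker_matrices)
adapted = adapt_means(lambdas, speaker_matrices, si_means)
print(lambdas, adapted.shape)
```

The sketch also shows where the storage saving described in the abstract would come from under these assumptions: only K matrices of size D x D are kept, instead of K complete speaker-dependent model sets.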
This article is indexed in CNKI, Wanfang Data, and other databases.