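Large-Scale Speaker Recognition Method Based on 2D-Haar Acoustic Features
Cite this article: XIE Er-man, LUO Sen-lin, PAN Li-min. Large-Scale Speaker Recognition Method Based on 2D-Haar Acoustic Features[J]. Journal of Beijing Institute of Technology (Natural Science Edition), 2014, 34(11): 1196-1201.
Authors: XIE Er-man  LUO Sen-lin  PAN Li-min
Affiliation (all three authors): Information System and Security and Countermeasures Experimental Center, Beijing Institute of Technology, Beijing 100081
Funding: National 242 Program fund (2005C48); Beijing Institute of Technology Science and Technology Innovation Program (2011CX01015)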
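Abstract: The accuracy of text-independent speaker recognition drops markedly as the number of speakers to be identified grows. To address this problem, a high-accuracy large-scale speaker recognition method is proposed. The method assembles the frame-level acoustic features of several consecutive audio frames into an acoustic feature map and derives high-dimensional 2D-Haar acoustic features from it, which makes it possible to train a better-performing classifier. The AdaBoost.MH algorithm is then used to select combinations of 2D-Haar acoustic features with good discriminative power for classifier training. Experimental results show a correct recognition rate of 89.5% with 600 speakers and an average accuracy of 91.3% across 100 to 600 speakers. The method is suited to large-scale speaker recognition, the introduced 2D-Haar acoustic features are effective, and recognition accuracy is high. The method also has low algorithmic complexity and high time efficiency.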

Keywords: speaker recognition  2D-Haar acoustic feature  AdaBoost.MH
Received: 7/3/2013

Large Scale Speaker Recognition Method that Uses 2D-Haar Acoustic Feature
XIE Er-man, LUO Sen-lin and PAN Li-min. Large Scale Speaker Recognition Method that Uses 2D-Haar Acoustic Feature[J]. Journal of Beijing Institute of Technology (Natural Science Edition), 2014, 34(11): 1196-1201.
Authors:XIE Er-man, LUO Sen-lin and PAN Li-min
Institution:Information System and Security and Countermeasures Experimental Center, Beijing Institute of Technology, Beijing 100081, China
Abstract:In text-independent speaker recognition, accuracy degrades significantly as the number of target speakers increases. To improve accuracy, a high-accuracy large-scale speaker recognition method is proposed. The method combines the frame-level acoustic features of a number of consecutive audio frames into an acoustic feature map and then derives high-dimensional 2D-Haar acoustic features from it, which makes it possible to train a better classifier. The AdaBoost.MH algorithm is then employed to select discriminative combinations of 2D-Haar acoustic features for classifier training. Experimental results show that the recognition rate is 89.5% with 600 target speakers and that the average rate is 91.3% as the number of target speakers increases from 100 to 600. The method is suitable for large-scale speaker recognition, and the 2D-Haar acoustic feature is effective in yielding high recognition accuracy. In addition, the method has low algorithmic complexity and low time consumption.
Keywords:speaker recognition  2D-Haar acoustic feature  AdaBoost.MH
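
The abstract describes stacking the acoustic features of consecutive frames into a feature map and computing 2D-Haar features over it. The following is a minimal Python sketch of that idea, not the authors' implementation. It assumes the map is a list of rows (one row per frame, one column per acoustic coefficient) and uses only two-rectangle Haar templates evaluated through a summed-area table. The function names, the map orientation, and the choice of templates are all assumptions made for illustration; the abstract does not specify them. The AdaBoost.MH selection step is not shown.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def integral_image(fmap: Matrix) -> List[List[float]]:
    """Summed-area table padded with a leading zero row and zero column."""
    rows, cols = len(fmap), len(fmap[0])
    ii = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0.0
        for c in range(cols):
            row_sum += fmap[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii: Matrix, top: int, left: int, height: int, width: int) -> float:
    """Sum of the map over a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_features(fmap: Matrix) -> List[float]:
    """All two-rectangle Haar responses over the map (hypothetical template set).

    Horizontal template: left half minus right half (even widths only).
    Vertical template: top half minus bottom half (even heights only).
    """
    ii = integral_image(fmap)
    rows, cols = len(fmap), len(fmap[0])
    feats: List[float] = []
    # Horizontal two-rectangle templates.
    for height in range(1, rows + 1):
        for width in range(2, cols + 1, 2):
            half = width // 2
            for top in range(rows - height + 1):
                for left in range(cols - width + 1):
                    feats.append(
                        rect_sum(ii, top, left, height, half)
                        - rect_sum(ii, top, left + half, height, half)
                    )
    # Vertical two-rectangle templates.
    for height in range(2, rows + 1, 2):
        half = height // 2
        for width in range(1, cols + 1):
            for top in range(rows - height + 1):
                for left in range(cols - width + 1):
                    feats.append(
                        rect_sum(ii, top, left, half, width)
                        - rect_sum(ii, top + half, left, half, width)
                    )
    return feats


# Toy 2-frame, 2-coefficient map (made-up numbers, for illustration only).
toy_map = [[1.0, 2.0], [3.0, 4.0]]
toy_features = haar_features(toy_map)
```

Because every rectangle sum costs four lookups regardless of its size, the number of candidate features can grow large while each one stays cheap to evaluate. A boosting algorithm such as the AdaBoost.MH step named in the abstract would then pick a discriminative subset of the resulting vector.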
This article is indexed in CNKI, Wanfang Data, and other databases.