
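Robust Speech Endpoint Detection Based on Dynamic Segmentation of the Power Spectral Envelope
Cite this article: 许春冬, 王晶, 战鸽, 应冬文, 李军锋, 颜永红. Robust speech endpoint detection based on dynamic segmentation of the power spectral envelope[J]. 北京理工大学学报, 2015, 35(11): 1189-1193.
Authors: 许春冬, 王晶, 战鸽, 应冬文, 李军锋, 颜永红
Institution: School of Information and Electronics, Beijing Institute of Technology, Beijing 100081; School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000; Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190
Funding: National Key Basic Research Program of China (2013CB32930); National Natural Science Foundation of China (61271426, 10925419, 90920302, 61072124, 11074275, 11161140319, 91120001); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030100, XDA06030500); National "863" Program of China (2012AA012503); Key Deployment Project of the Chinese Academy of Sciences (KGZD-EW-103-2); Research Fund of Jiangxi University of Science and Technology (NSFJ2015-G21)
Abstract: In complex acoustic environments, interference from ambient noise makes acoustic features insufficiently stable. To overcome this difficulty, the decision results are usually smoothed along the time dimension. These smoothing procedures, however, do not take into account the structure of the data along the time dimension and are therefore heuristic. This paper adopts a dynamic segmentation approach: the spectral envelope of the speech is partitioned along the time dimension into time blocks with homogeneous characteristics, the energy feature is computed per block, and the speech/non-speech decision is made per block, which improves the stability of speech endpoint detection. Experiments show that the proposed method effectively improves the robustness of speech endpoint detection.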

Keywords: speech endpoint detection; dynamic segmentation; clustering; minimum description length criterion
Received: 2014/11/2

Speech Endpoint Detection Based on the Dynamic Segmentation of Power Spectral Envelope
XU Chun-dong, WANG Jing, ZHANG Ge, YING Dong-wen, LI Jun-feng and YAN Yong-hong. Speech Endpoint Detection Based on the Dynamic Segmentation of Power Spectral Envelope[J]. Journal of Beijing Institute of Technology (Natural Science Edition), 2015, 35(11): 1189-1193.
Authors: XU Chun-dong, WANG Jing, ZHANG Ge, YING Dong-wen, LI Jun-feng and YAN Yong-hong
Institution: School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China; School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China; Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
Abstract: Acoustic features are not robust enough under interference from environmental noise. Heuristic approaches that smooth the noisy spectra have been introduced to address this problem, but they do not consider the intrinsic correlation in the time domain. This paper presents a novel endpoint detection method in which the time sequence of logarithmic power is partitioned into homogeneous blocks using dynamic auto-segmentation. The acoustic feature is extracted from each homogeneous block, and endpoint detection is performed with the homogeneous block as the decision unit. Experimental results show that the proposed method improves the robustness of endpoint detection.
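The block-wise idea in the abstract can be illustrated with a short Python sketch. It is not the authors' implementation: the abstract does not give the segmentation algorithm, the form of the description-length criterion, or the decision rule, so everything below is an assumption made for illustration. The function names (`frame_log_power`, `segment_blocks`, `classify_blocks`), the frame sizes, the piecewise-constant model with a per-block penalty of `1.5 * log(n)`, and the `margin` threshold are all hypothetical.

```python
import numpy as np

def frame_log_power(signal, frame_len=256, hop=128, eps=1e-10):
    """Natural-log power of each (overlapping) frame of a 1-D signal."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] for i in range(n_frames)])
    return np.log(np.mean(frames ** 2, axis=1) + eps)

def segment_blocks(x, noise_var=None, penalty=None):
    """Split x into piecewise-constant blocks by dynamic programming.

    Assumed cost (an MDL-style stand-in, not the paper's criterion):
    sum over blocks of SSE / (2 * noise_var) + penalty.
    Returns a list of (start, end) index pairs, end exclusive.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    if n < 2:
        return [(0, n)]
    if noise_var is None:
        # Robust variance estimate from first differences (assumption).
        noise_var = (np.median(np.abs(np.diff(x))) / 0.6745) ** 2 / 2
    noise_var = max(noise_var, 1e-12)
    if penalty is None:
        penalty = 1.5 * np.log(n)
    c1 = np.concatenate(([0.0], np.cumsum(x)))       # prefix sums of x
    c2 = np.concatenate(([0.0], np.cumsum(x ** 2)))  # prefix sums of x^2
    best = np.full(n + 1, np.inf)
    best[0] = 0.0
    prev = np.zeros(n + 1, dtype=int)
    for end in range(1, n + 1):
        starts = np.arange(end)
        length = end - starts
        s = c1[end] - c1[starts]
        sse = (c2[end] - c2[starts]) - s ** 2 / length  # error around block mean
        cost = best[starts] + sse / (2 * noise_var) + penalty
        k = int(np.argmin(cost))
        best[end], prev[end] = cost[k], k
    blocks, end = [], n
    while end > 0:  # backtrack the optimal boundaries
        blocks.append((int(prev[end]), end))
        end = int(prev[end])
    return blocks[::-1]

def classify_blocks(x, blocks, margin=3.0):
    """Label every frame of a block as speech if the block mean exceeds
    the lowest block mean (taken as the noise floor) by `margin`."""
    means = np.array([x[s:e].mean() for s, e in blocks])
    floor = means.min()
    labels = np.zeros(len(x), dtype=bool)
    for (s, e), m in zip(blocks, means):
        labels[s:e] = m > floor + margin
    return labels

# Synthetic demo: quiet noise, a louder burst standing in for speech, quiet noise.
rng = np.random.default_rng(0)
sig = 0.01 * rng.standard_normal(24000)
sig[8000:16000] += 0.5 * rng.standard_normal(8000)
log_power = frame_log_power(sig)
blocks = segment_blocks(log_power)
labels = classify_blocks(log_power, blocks)
```

Because the decision is taken once per block rather than once per frame, an isolated noisy frame inside a homogeneous block cannot flip the label on its own, which is the stabilising effect the abstract attributes to segmentation.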
This article is indexed in CNKI, Wanfang Data, and other databases.