
1.

Speech Enhancement by a Dynamic-Threshold Spectral Method  Total citations: 2, self-citations: 0, citations by others: 2

陆生礼余崇智《南京大学学报(自然科学版)》1996,32(2):218-223

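Drawing on the ability of the human ear to extract useful information from noise, and on the basic characteristics of speech signals, the authors propose and study an auditory model suited to speech enhancement. Experimental results show that the method gives a good improvement both in raising the signal-to-noise ratio of the speech and in reducing speech distortion.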

2.

Efficient auditory coding  Total citations: 2, self-citations: 0, citations by others: 2

Smith EC Lewicki MS 《Nature》2006,439(7079):978-982

The auditory neural code must serve a wide range of auditory tasks that require great sensitivity in time and frequency and be effective over the diverse array of sounds present in natural acoustic environments. It has been suggested that sensory systems might have evolved highly efficient coding strategies to maximize the information conveyed to the brain while minimizing the required energy and neural resources. Here we show that, for natural sounds, the complete acoustic waveform can be represented efficiently with a nonlinear model based on a population spike code. In this model, idealized spikes encode the precise temporal positions and magnitudes of underlying acoustic features. We find that when the features are optimized for coding either natural sounds or speech, they show striking similarities to time-domain cochlear filter estimates, have a frequency-bandwidth dependence similar to that of auditory nerve fibres, and yield significantly greater coding efficiency than conventional signal representations. These results indicate that the auditory code might approach an information theoretic optimum and that the acoustic structure of speech might be adapted to the coding capacity of the mammalian auditory system.

3.

Auditory Perception of Phase Information in Speech Signals  Total citations: 4, self-citations: 0, citations by others: 4

同鸣卞正中戴启军陈砚圃张亮《西安交通大学学报》2003,37(12):1288-1291,1307

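Subjective listening tests were used to study how phase information in speech signals affects human auditory perception. The results show that, when the magnitude spectrum of the speech signal is held fixed and its phase spectrum is altered, the reconstructed speech is subjectively indistinguishable from the original as long as the time-domain envelope of the reconstructed signal is unchanged. The perceived quality of the reconstructed speech depends mainly on the fluctuation amplitude of the derivative of the added phase with respect to frequency. The maximum relative time shift between different frequency components of the reconstructed speech determines the perceived quality: quality is best when the maximum relative time shift is below 10 ms, and as long as the maximum relative time shift caused by phase distortion stays below 20 ms, normal comprehension of continuous speech is not affected.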

4.

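Speech Enhancement Using an Auditory-Model-Based Wavelet Packet Transform  Total citations: 8, self-citations: 0, citations by others: 8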

王炜杨道淳方元徐柏龄《南京大学学报(自然科学版)》2001,(5)

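Because the frequency resolution of the human ear is nonlinear, it is difficult to model the frequency-analysis behaviour of the basilar membrane with traditional linear signal-processing methods such as the FFT. The wavelet packet algorithm offers flexible time-frequency analysis and matches the frequency-analysis characteristics of the basilar membrane comparatively well. To model the auditory mechanism of the ear, a dynamic-threshold method is used to denoise noisy speech successfully, and the problem of musical noise introduced during denoising is also handled comparatively well. Experiments show that, in the single-channel case, the enhanced speech has higher clarity and intelligibility than that obtained with traditional spectral subtraction.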

5.

Complex auditory behaviour emerges from simple reactive steering

Hedwig B Poulet JF 《Nature》2004,430(7001):781-785

The recognition and localization of sound signals is fundamental to acoustic communication. Complex neural mechanisms are thought to underlie the processing of species-specific sound patterns even in animals with simple auditory pathways. In female crickets, which orient towards the male's calling song, current models propose pattern recognition mechanisms based on the temporal structure of the song. Furthermore, it is thought that localization is achieved by comparing the output of the left and right recognition networks, which then directs the female to the pattern that most closely resembles the species-specific song. Here we show, using a highly sensitive method for measuring the movements of female crickets, that when walking and flying each sound pulse of the communication signal releases a rapid steering response. Thus auditory orientation emerges from reactive motor responses to individual sound pulses. Although the reactive motor responses are not based on the song structure, a pattern recognition process may modulate the gain of the responses on a longer timescale. These findings are relevant to concepts of insect auditory behaviour and to the development of biologically inspired robots performing cricket-like auditory orientation.

6.

A Speech Denoising Method Combining SS, MFCC and PMC Techniques

丁冬冬佘玉梅江涛庄丽王米利刘敬凤《云南民族大学学报(自然科学版)》2014,(3):232-234

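To raise the recognition rate of speech recognition systems in noisy conditions, a method combining signal-level, parameter-level and model-level denoising is proposed. Spectral subtraction is first applied to the noisy speech signal; Mel-frequency cepstral coefficients (MFCC) are then used to extract features from the processed signal; finally, parallel model combination (PMC) processing yields a speech signal with a higher recognition rate.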
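The signal-level stage named in this abstract, spectral subtraction, can be illustrated on a single frame. The sketch below is a generic magnitude spectral subtraction, not the authors' implementation: the function names, the `floor` parameter, and the use of a naive DFT (chosen only to keep the example dependency-free) are all assumptions made for illustration.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform (illustrative, not fast)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def spectral_subtract(frame, noise_mag, floor=0.0):
    """Subtract an estimated noise magnitude from each bin of one frame.

    frame:     time-domain samples of the noisy frame
    noise_mag: per-bin noise magnitude estimate (hypothetical input,
               e.g. averaged over frames known to contain only noise)
    floor:     fraction of the noisy magnitude kept as a lower bound
    """
    cleaned = []
    for s, nm in zip(dft(frame), noise_mag):
        mag = max(abs(s) - nm, floor * abs(s))      # never go below the floor
        cleaned.append(cmath.rect(mag, cmath.phase(s)))  # keep the noisy phase
    return idft(cleaned)
```

A nonzero `floor` is a common way to limit the musical-noise artifacts that plain subtraction tends to produce; the MFCC and PMC stages described in the abstract are outside the scope of this sketch.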

7.

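An LPC-Based Deep Feature Extraction Algorithm for Speech Signals in the Mel Frequency Domain  Total citations: 1, self-citations: 0, citations by others: 1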

罗元吴承军张毅黎小松席兵《重庆邮电大学学报(自然科学版)》2016,28(2):174-179

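To address the poor real-time performance of traditional secondary feature extraction methods for speech signals when the recognition rate must be maintained, an improved deep feature extraction algorithm based on linear predictive coefficients (LPC) in the Mel frequency domain is proposed. In line with the auditory characteristics of the human ear, the method applies a nonlinear transformation to the LPC in the Mel frequency domain, followed by differentiation, higher-order differentiation and proportional recombination, yielding a new feature parameter that accounts for both vocal-tract excitation and human hearing. This greatly reduces the computation required by traditional deep feature extraction and greatly improves real-time performance without affecting recognition efficiency. Finally, the algorithm is validated on an intelligent wheelchair platform. Extensive experiments show that the poor real-time performance of the voice control system is clearly improved with the algorithm, which both preserves the recognition rate of the extracted features and effectively improves the real-time performance of the system, making voice-controlled intelligent wheelchairs more practical to some extent.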

8.

A Novel Visualization Tool for Manual Annotation when Building Large Speech Corpora

SHE Kun CHEN Shuzhen YANG Shen ZOU Lian 《武汉大学学报:自然科学英文版》2006,11(2):381-384


9.

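A New Noise-Robust Feature Parameter for Mandarin Digit Speech Recognition  Total citations: 1, self-citations: 1, citations by others: 0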

张涛郜彦华《河南科技大学学报(自然科学版)》2005,26(3):46-48

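To improve the recognition performance of small- and medium-vocabulary speech recognition systems in noisy environments, the ten Mandarin digits are taken as the object of study. Exploiting the quasi-periodicity that distinguishes Mandarin digit speech from noise, a method is proposed for extracting the peak characteristics of the spectral envelope of Mandarin digit speech. The speech spectrum is first sampled at the fundamental frequency to obtain an envelope made up of harmonic values, which raises the signal-to-noise ratio; peaks are then extracted from this envelope to give the peak features of the digit speech. Experimental results show that the peak features obtained this way have a degree of noise robustness when the signal-to-noise ratio is above 5 dB.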
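The two steps in this abstract, sampling the spectrum at multiples of the fundamental and then picking peaks of the resulting envelope, can be sketched as follows. This is only an illustration under simplifying assumptions: the fundamental is taken to fall exactly on a known spectrum bin (`f0_bin`, a hypothetical parameter), and the peak rule is a plain local-maximum test, which may differ from what the paper uses.

```python
def harmonic_envelope(mag, f0_bin):
    """Sample a magnitude spectrum at integer multiples of the F0 bin.

    Bins between harmonics, where noise dominates for quasi-periodic
    speech, are simply skipped.
    """
    return [mag[k] for k in range(f0_bin, len(mag), f0_bin)]

def envelope_peaks(env):
    """Indices of local maxima in the harmonic envelope (endpoints excluded)."""
    return [i for i in range(1, len(env) - 1)
            if env[i] > env[i - 1] and env[i] >= env[i + 1]]

# Toy spectrum: a flat noise floor of 0.5 with harmonics every 2 bins.
mag = [0.5] * 20
for harmonic_index, value in enumerate([1, 3, 2, 1, 4, 2, 1, 1, 1]):
    mag[2 * (harmonic_index + 1)] = value

env = harmonic_envelope(mag, f0_bin=2)
peaks = envelope_peaks(env)   # positions within env, i.e. harmonic number - 1
```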

10.

Identification of Structural Modal Parameters Based on Sound Pressure Measurement

郑佳艳刘年周志祥余忠儒唐俊义邓国军《科学技术与工程》2021,21(30):13116-13122

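To overcome the added-mass effect of contact measurement sensors, this paper uses non-contact sound pressure measurement to acquire the radiated sound pressure signal in the near field of a structure, and proposes a Hilbert-Huang transform time-frequency analysis method with secondary filtering to analyse the non-stationary near-field radiated sound pressure signal, achieving identification of the modal parameters of bridge structures in a strongly noisy environment. Tests were carried out in the laboratory on a simply supported H-section steel beam with a span of 11.2 m. The results show that the method accurately identifies the first three natural frequencies of the steel beam, with an average error within 0.5%, and successfully obtains the mode shapes of the structure. The method provides a new means of acquiring the modal parameters of bridge structures and of structural health monitoring.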

11.

Bandwidth Extension of Speech Signals Based on Vector Quantization

郎玥赵胜辉匡镜明《北京理工大学学报》2005,25(3):260-264

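An improved bandwidth extension method based on vector quantization is presented. A requantization method is proposed for codebook construction, and the gain is adjusted by combining the codebook with the degree of voicing. First, the narrowband input signal is classified as unvoiced, voiced or silent according to voicing and energy criteria. A different codebook is then selected for each class, and the spectral envelope of the narrowband signal is converted to the spectral envelope of the high-band signal using the vector-quantization-based method. Next, high-band speech is synthesized from an excitation signal (Gaussian white noise) and the reconstructed high-band spectral envelope. Finally, the sum of the high band and the original narrowband signal is taken as the final wideband signal. Simulations and comparison with other methods show that the proposed method requires little computation and is suitable for real-time environments.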

12.

Noise-Robust Speech Recognition Based on Weighted Combined Zero-Crossing Peak Amplitude Features

梁五洲张雪英《太原理工大学学报》2006,37(1):84-86

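Based on the characteristics of human hearing, a new noise-robust recognition feature is proposed: the weighted combined zero-crossing peak amplitude feature, an improvement on the zero-crossing peak amplitude feature. The new feature takes the speech data and the differenced speech data as its inputs, obtains frequency information by computing their upward zero-crossing rates, obtains density information through nonlinear amplitude compression, and weights the result according to how the human ear perceives sound to form the final output feature. An HMM serves as the recognition network. Simulations compare the recognition results obtained with the new feature and with the original feature, and show that the new feature has a higher recognition rate and good noise robustness.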
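The frequency cue mentioned in this abstract, the upward zero-crossing rate of the speech data and of the differenced speech data, can be sketched in a few lines. The helper names below are hypothetical, and the amplitude compression, perceptual weighting and HMM stages of the paper are not reproduced; the sketch only shows why counting upward crossings yields a frequency estimate for a narrowband signal.

```python
import math

def upward_zero_crossings(x):
    """Count sample pairs where the signal goes from negative to non-negative."""
    return sum(1 for a, b in zip(x, x[1:]) if a < 0 <= b)

def difference(x):
    """First-order difference of the signal (the 'differenced speech data')."""
    return [b - a for a, b in zip(x, x[1:])]

def crossing_frequency(x, sample_rate):
    """Frequency estimate in Hz: one upward crossing per period."""
    return upward_zero_crossings(x) * sample_rate / len(x)

# A 100 Hz tone sampled at 8 kHz for 0.1 s, with an arbitrary phase offset.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 100 * n / sample_rate + 0.5) for n in range(800)]
estimate = crossing_frequency(tone, sample_rate)
```

For a pure tone the differenced signal has the same period, so both inputs give the same crossing count; for real speech the difference emphasizes higher frequencies, which is presumably why the two are combined.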

13.

Cortical remodelling induced by activity of ventral tegmental dopamine neurons.  Total citations: 15, self-citations: 0, citations by others: 15

S Bao V T Chan M M Merzenich 《Nature》2001,412(6842):79-83

Representations of sensory stimuli in the cerebral cortex can undergo progressive remodelling according to the behavioural importance of the stimuli. The cortex receives widespread projections from dopamine neurons in the ventral tegmental area (VTA), which are activated by new stimuli or unpredicted rewards, and are believed to provide a reinforcement signal for such learning-related cortical reorganization. In the primary auditory cortex (AI) dopamine release has been observed during auditory learning that remodels the sound-frequency representations. Furthermore, dopamine modulates long-term potentiation, a putative cellular mechanism underlying plasticity. Here we show that stimulating the VTA together with an auditory stimulus of a particular tone increases the cortical area and selectivity of the neural responses to that sound stimulus in AI. Conversely, the AI representations of nearby sound frequencies are selectively decreased. Strong, sharply tuned responses to the paired tones also emerge in a second cortical area, whereas the same stimuli evoke only poor or non-selective responses in this second cortical field in naive animals. In addition, we found that strong long-range coherence of neuronal discharge emerges between AI and this secondary auditory cortical area.

14.

Better speech recognition with cochlear implants.  Total citations: 34, self-citations: 0, citations by others: 34

B S Wilson C C Finley D T Lawson R D Wolford D K Eddington W M Rabinowitz 《Nature》1991,352(6332):236-238

HIGH levels of speech recognition have been achieved with a new sound processing strategy for multielectrode cochlear implants. A cochlear implant system consists of one or more implanted electrodes for direct electrical activation of the auditory nerve, an external speech processor that transforms a microphone input into stimuli for each electrode, and a transcutaneous (rf-link) or percutaneous (direct) connection between the processor and the electrodes. We report here the comparison of the new strategy and a standard clinical processor. The standard compressed analogue (CA) processor presented analogue waveforms simultaneously to all electrodes, whereas the new continuous interleaved sampling (CIS) strategy presented brief pulses to each electrode in a nonoverlapping sequence. Seven experienced implant users, selected for their excellent performance with the CA processor, participated as subjects. The new strategy produced large improvements in the scores of speech reception tests for all subjects. These results have important implications for the treatment of deafness and for minimal representations of speech at the auditory periphery.

15.

Design of a Multi-Channel Sound Card Based on TDM

周旺姜弢《应用科技》2010,37(10):31-35

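To improve sound card utilization in voyage data recorders (VDR), reduce cumbersome operating procedures and strengthen system data security, a shipborne VDR sound card design is proposed. It is based on the TDM (time-division multiplexing) method and on multi-channel data transmission of digital audio signals over the PCI (peripheral component interconnect) bus, uses a bus connection between the DSPC54 and the AMBE-3000, and exploits the high-speed parallel data processing of the DSP and the low bit rate and full-duplex operation of the AMBE-3000 speech codec. The design offers real-time multi-channel speech processing, low system complexity and simple operation, and replaces the single-card, multi-slot structure of the VDR sound card with a time-shared multi-channel parallel structure. Objective distortion tests and simulation tests show that the synthesized speech matches the original speech in formant and pitch-period structure and has fairly good intelligibility.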

16.

The FCNN Deep Learning Model and Its Application to Animal Sound Recognition

石鑫鑫鱼昕刘铭《吉林大学学报(信息科学版)》2021,39(1):60-65

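To address the problem of accurately identifying animals from their vocal signals in order to protect and study wildlife, a fully convolutional neural network (FCNN: Fully Convolutional Neural Network) that combines a fully connected algorithm with a sparsely connected algorithm is proposed for automatic recognition of the sounds. The fully connected algorithm extracts more combined features, while the sparsely connected algorithm selects the important features, which speeds up convergence. The specific model structure and algorithm... are also given.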

17.

Environmental Sound Classification Based on Auditory Perception Features

刘芹《科学技术与工程》2013,13(21):6107-6110,6133

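Most environmental sounds are inharmonic and less stationary than speech and music. To address the shortcomings of traditional time-frequency analysis, an environmental sound feature extraction method based on auditory perception is proposed. Because the study has only a small sample, a support vector machine (SVM) is used as the classification algorithm to classify environmental sounds. Simulation results show that the features and method used are effective.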

18.

Reconstruction of Speech Signals from the Autocorrelation Spectrum of an Auditory Model

王永琦史水娥杨豪强刘小先《河南师范大学学报(自然科学版)》2004,32(1):110-112

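This work studies how to recover the original sound signal from the autocorrelation spectrum of an auditory model. The magnitude of the Fourier transform of the original signal is obtained from the short-time autocorrelation function, and an iterative algorithm then recovers the speech signal from the Fourier transform magnitude alone.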

19.

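A Speech Rate Modification Algorithm and Its Real-Time Implementation on the TMS320C5402  Total citations: 2, self-citations: 0, citations by others: 2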

刘耦耕贺素良龙永红《中南大学学报(自然科学版)》2004,35(1):117-121

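A speech signal can be expressed as the convolution of an excitation source with the impulse response of a linear time-varying system. If the excitation is a white noise signal, the vocal tract produces unvoiced sound; if the excitation is a quasi-periodic signal, it produces voiced sound. Within a speech signal sequence, it is the voiced sound that determines the speaking rate; it carries the pitch and is a quasi-periodic signal made up of multiple harmonics. The speech sequence can be regarded as pitch periods superimposed after delays of integer multiples. Inserting some pitch periods lowers the speech rate, and deleting some raises it. However, inserting or deleting pitch periods makes the phase of the speech signal discontinuous and causes audible jumps. To deal with this, overlapping framing is used to divide the speech signal into a sequence of short-time segments. The correlation function between adjacent short-time segments of the sequence after insertion or deletion is then computed, and its maximum is found. Adjacent segments are joined at the point of maximum correlation so that their phase is close to continuous. In addition, a hardware design for a dual-processor system with a TMS320C5402 and an AT89C51 is proposed, in which the speech rate algorithm runs on the TMS320C5402 and the human-machine interaction runs on the AT89C51 microcontroller.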
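The phase-matching step described in this abstract, joining adjacent short-time segments at the point of maximum correlation, can be sketched as below. It is similar in spirit to synchronized overlap-add, but it is an illustrative reconstruction rather than the authors' algorithm: the function names, the unnormalized correlation, and the linear crossfade are all assumptions.

```python
import math

def best_lag(prev_tail, next_seg, max_lag):
    """Offset into next_seg whose samples correlate best with prev_tail.

    Requires len(next_seg) >= max_lag + len(prev_tail).
    """
    n = len(prev_tail)

    def corr(lag):
        # Unnormalized cross-correlation at this candidate offset.
        return sum(p * q for p, q in zip(prev_tail, next_seg[lag:lag + n]))

    return max(range(max_lag + 1), key=corr)

def splice(prev, next_seg, overlap, max_lag):
    """Join two segments at the maximum-correlation point with a crossfade."""
    tail = prev[-overlap:]
    aligned = next_seg[best_lag(tail, next_seg, max_lag):]
    faded = [tail[i] * (1 - i / overlap) + aligned[i] * (i / overlap)
             for i in range(overlap)]
    return prev[:-overlap] + faded + aligned[overlap:]

# Quasi-periodic toy signal with a pitch period of 20 samples.
signal = [math.sin(2 * math.pi * n / 20) for n in range(100)]
prev = signal[0:40]
next_seg = signal[7:67]          # starts 7 samples out of phase with prev's tail
joined = splice(prev, next_seg, overlap=20, max_lag=19)
```

Searching lags over one pitch period is enough for a periodic signal, since some offset within a period always restores phase alignment; deleting or repeating whole periods before the splice then changes duration without an audible jump.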

20.

A Preliminary Study of the Relationship Between Auditory Evoked Potentials and Acoustic Comfort

管宏宇胡松涛刘国丹陈晗《科学技术与工程》2018,18(28)

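To study the relationship between acoustic comfort and auditory evoked potentials, an evoked potential instrument was used to record auditory evoked potentials under stimuli of different frequencies and sound pressure levels. The results show that the peaks of the evoked potential become denser as the sound frequency rises, which matches the frequency characteristics of the sound itself, and that the amplitude of the evoked potential increases with sound pressure level, which explains why subjective annoyance grows as sound pressure level increases. Under stimuli of different frequencies and sound pressure levels, wave V is more stable than the other waveforms, indicating that the upper pons or lower midbrain is more sensitive to sound characteristics. The study can serve as a reference for research on the mechanism of human acoustic comfort in different acoustic environments.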