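Support Vector Machine for Acoustic Scene Classification Algorithm Research Based on Multi-Features Fusion
Cite this article: ZHAO Wei, JIN Cong, TU Zhong-wen, SRIDHAR Krishnan, LIU Shan. Support Vector Machine for Acoustic Scene Classification Algorithm Research Based on Multi-Features Fusion[J]. Journal of Beijing Institute of Technology (Natural Science Edition), 2020, 40(1): 69-75. DOI: 10.15918/j.tbit1001-0645.2018.171
Authors: ZHAO Wei, JIN Cong, TU Zhong-wen, SRIDHAR Krishnan, LIU Shan
Affiliation: 1. School of Information and Communication Engineering, Communication University of China, Beijing 100024;
Funding: National Natural Science Foundation of China (61631016, 61901421); Fundamental Research Funds for the Central Universities (CUC19ZD003)
Abstract: Working with the acoustic scene dataset of the DCASE 2017 Challenge, Mel frequency cepstral coefficients (MFCC), short-time energy (SE), acoustic event likelihood features (AELF), and mute time (MT) features are extracted and combined into a multi-feature fusion matrix. After comparing several kernel functions and parameter search algorithms, the Gaussian radial basis function kernel (RK) is selected to build a support vector machine (SVM) model, and cross-validation (CV) is used to tune the SVM parameters for classifying 15 acoustic scenes. Experimental results show that classification accuracy for the grocery store and office scenes exceeds 90%, and that the average classification accuracy reaches 71.11%, far above the 61% average accuracy of the challenge's baseline system.

Keywords: acoustic scene classification; support vector machine; parameter optimization; feature fusion
Received: 2018-04-17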

Support Vector Machine for Acoustic Scene Classification Algorithm Research Based on Multi-Features Fusion
ZHAO Wei, JIN Cong, TU Zhong-wen, SRIDHAR Krishnan and LIU Shan. Support Vector Machine for Acoustic Scene Classification Algorithm Research Based on Multi-Features Fusion[J]. Journal of Beijing Institute of Technology (Natural Science Edition), 2020, 40(1): 69-75. DOI: 10.15918/j.tbit1001-0645.2018.171
Authors: ZHAO Wei, JIN Cong, TU Zhong-wen, SRIDHAR Krishnan, LIU Shan
Affiliation: 1. School of Information and Communication Engineering, Communication University of China, Beijing 100024, China; 2. School of Broadcasting and Hosting Art, Communication University of China, Beijing 100024, China; 3. Department of Electrical and Computer Engineering, Ryerson University, Toronto M5B 2K3, Canada
Abstract: For the acoustic scene dataset of the DCASE 2017 Challenge, Mel frequency cepstral coefficients (MFCC), short-time energy (SE), acoustic event likelihood features (AELF), and mute time (MT) features were extracted to form a multi-feature fusion matrix. After comparing several kernel functions and parameter search algorithms, the Gaussian radial basis function kernel (RK) was selected to build the support vector machine (SVM) model, and cross-validation (CV) was used to tune the SVM parameters for classifying 15 acoustic scenes. The experimental results show that the classification accuracy for the grocery store and office scenes exceeds 90%, and that the average classification accuracy reaches 71.11%, far above the 61% average accuracy of the baseline system provided with the challenge.
Keywords: acoustic scene classification; support vector machine; parameter optimization; feature fusion
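
The pipeline summarized in the abstract (feature fusion, an RBF-kernel SVM, and cross-validated parameter tuning) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the feature blocks are random synthetic stand-ins rather than real MFCC, SE, AELF, or MT values, the block dimensions (13, 1, 5, 1), the toy class count of 3, and the `C`/`gamma` grid are all assumptions chosen for the example, and a plain grid search stands in for whichever search algorithm the paper finally used.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy sizes (assumption): the paper classifies 15 scenes; 3 keeps the demo fast.
n_classes, clips_per_class = 3, 40
labels = np.repeat(np.arange(n_classes), clips_per_class)


def fake_block(dim):
    """Synthetic stand-in for one per-clip feature block (not real audio features)."""
    centers = rng.normal(0.0, 2.0, size=(n_classes, dim))
    return centers[labels] + rng.normal(0.0, 1.0, size=(labels.size, dim))


# Hypothetical per-clip feature blocks; the dimensions are illustrative only.
mfcc = fake_block(13)  # e.g. MFCC statistics per clip
se = fake_block(1)     # short-time energy summary
aelf = fake_block(5)   # acoustic event likelihood features
mt = fake_block(1)     # mute time

# Feature fusion: concatenate the blocks column-wise into one matrix.
X = np.hstack([mfcc, se, aelf, mt])

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, stratify=labels, random_state=0
)

# Standardize the fused features, then fit an SVM with a Gaussian RBF kernel.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Candidate values for the penalty C and the kernel width gamma (assumed grid).
param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": [1e-3, 1e-2, 1e-1, 1],
}

# 5-fold stratified cross-validation scores each (C, gamma) pair.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(model, param_grid, cv=cv)
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 3))
```

Standardizing before the SVM matters here because the fused blocks can have very different scales, and the RBF kernel is distance-based; with real features the accuracy would of course depend on the data rather than on these synthetic clusters.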
This article is indexed in CNKI and other databases.