Protein Mass Spectrometry Data Analysis Based on the T-Test and Support Vector Machine |
| |
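Cite this article: | Zou Xiuming, Luo Nan, Sun Huaijiang. Protein Mass Spectrometry Data Analysis Based on T test and Support Vector Machine[J]. Journal of Huaiyin Teachers College (Natural Science Edition) (淮阴师范学院学报(自然科学版)), 2011, 10(5): 409-413 |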
| |
Authors: | Zou Xiuming (邹修明), Luo Nan (罗楠), Sun Huaijiang (孙怀江) |
| |
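Affiliations: | 1. School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094 / School of Physics and Electronic Electrical Engineering, Huaiyin Normal University, Huai'an, Jiangsu 223300 2. School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094 |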
| |
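Abstract: | Pattern recognition on protein mass spectrometry data has become a new approach to cancer diagnosis, but such data are high-dimensional with few samples, which poses a major challenge for data analysis. After preprocessing the raw data with baseline correction and normalization and reducing its dimensionality by binning, T-test feature selection is proposed for analyzing protein mass spectrometry data. The experiments classify an ovarian mass spectrometry dataset, using 10-fold cross-validation to select training and test samples, with support vector...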
|
Keywords: | protein mass spectrometry; binning; T-test; support vector machine |
Protein Mass Spectrometry Data Analysis Based on T test and Support Vector Machine |
| |
Abstract: | Pattern analysis of protein mass spectrometry data has become a new method of cancer diagnosis. However, such data are high-dimensional with small sample sizes, which poses a major challenge for data analysis. After preprocessing the raw data with baseline correction and normalization and reducing its dimensionality by binning, a T-test is proposed for selecting features to analyze protein mass spectrometry data. In the experiment, an ovarian mass spectrometry dataset is classified, using 10-fold cross-validation to obtain training and test data and an SVM as the classifier. The results show that the proposed method selects only a small feature subset while achieving a very high recognition rate: sensitivity, specificity, and overall recognition rate reach 100%, 96.7%, and 98.8%, respectively. |
| |
Keywords: | protein mass spectrometry; binning; T-test; support vector machine |
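
The binning and T-test feature-selection steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bin size, the use of Welch's unequal-variance form of the t-statistic, the function names (`bin_spectrum`, `welch_t`, `select_features`), and the synthetic spectra are all assumptions made for illustration. Baseline correction, normalization, and the SVM classifier are omitted.

```python
import math
import random
from statistics import mean, variance


def bin_spectrum(intensities, bin_size):
    """Reduce dimensionality by averaging consecutive intensities in fixed-size bins."""
    return [
        mean(intensities[i:i + bin_size])
        for i in range(0, len(intensities), bin_size)
    ]


def welch_t(a, b):
    """Two-sample t-statistic with unequal variances (Welch's form, an assumption here)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def select_features(cancer, control, k):
    """Return the indices of the k binned features with the largest |t| between classes."""
    n_features = len(cancer[0])
    scores = [
        abs(welch_t([s[j] for s in cancer], [s[j] for s in control]))
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


# Synthetic stand-in data (hypothetical, NOT the ovarian dataset from the paper).
rng = random.Random(0)


def fake_spectrum(shift):
    """40 noisy intensity values; points 8..11 are raised by `shift`."""
    raw = [rng.gauss(0.0, 1.0) for _ in range(40)]
    for i in range(8, 12):
        raw[i] += shift
    return raw


cancer = [bin_spectrum(fake_spectrum(5.0), 4) for _ in range(20)]
control = [bin_spectrum(fake_spectrum(0.0), 4) for _ in range(20)]

# Bin index 2 covers raw points 8..11, so it should rank first.
top = select_features(cancer, control, 2)
print(top)
```

When such a selection step is combined with cross-validation, the t-statistics are normally computed on the training folds only, so that the held-out samples do not influence which features are chosen.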
This article is indexed in Wanfang Data (万方数据) and other databases. |
|