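Protein Free Energy Prediction Based on Sequence and Structure Features
Cite this article: LU Bangli, CHEN Qingfeng, JIANG Jiawen, LUO Haiqiong. Protein Free Energy Prediction Based on Sequence and Structure Features[J]. Guangxi Sciences, 2017, 24(3): 286-291. DOI: 10.13656/j.cnki.gxkx.20170601.002
Authors: LU Bangli  CHEN Qingfeng  JIANG Jiawen  LUO Haiqiong
Affiliations: 1. School of Computer, Electronics and Information, Guangxi University, Nanning, Guangxi 530004; 2. School of Computer, Electronics and Information, Guangxi University, Nanning, Guangxi 530004; State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Guangxi University, Nanning, Guangxi 530004; 3. School of Information and Management, Guangxi Medical University, Nanning, Guangxi 530021
Funding: National Natural Science Foundation of China; Key Project of the Guangxi Natural Science Foundation
Abstract: [Objective] Protein free energy accurately reflects protein interactions and is of great help to drug design, so building an accurate regression model of protein free energy is necessary. [Methods] 135 pairs of protein complexes were collected and 600 features were computed. Minimum redundancy maximum relevance (mRMR) was used to select features significantly correlated with protein free energy and to remove redundant ones, yielding a minimum-redundancy, maximum-relevance feature set. Six regression models were built on the selected features, and the importance of the selected features was analyzed by removing them and comparing the results. Finally, the best model was identified by comparing 10-fold cross-validation results and used to predict protein free energy. [Results] On the 135 pairs of protein complexes, the model achieved a higher correlation coefficient and a lower mean absolute error than other methods. [Conclusion] The model selected by this method has better prediction accuracy than the models selected by other methods.

Keywords: protein interaction  free energy  feature selection  regression model
Received: 2017-03-25
Revised: 2017-05-24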
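The feature-selection step named in the abstract can be illustrated with a minimal sketch of greedy mRMR. This is not the authors' implementation: the abstract does not say which mRMR variant was used, so the sketch assumes absolute Pearson correlation as a stand-in for both relevance and redundancy (mRMR is commonly formulated with mutual information), and the names `pearson`, `mrmr_select`, and the toy data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mrmr_select(columns, target, k):
    """Greedily pick k feature indices, scoring each candidate as
    relevance to the target minus mean redundancy with features already chosen."""
    relevance = [abs(pearson(col, target)) for col in columns]
    selected = []
    remaining = set(range(len(columns)))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(
                abs(pearson(columns[j], columns[s])) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): 135 samples and 4 candidate features.
rng = random.Random(0)
n = 135
f0 = [rng.gauss(0, 1) for _ in range(n)]       # informative
f1 = [v + rng.gauss(0, 0.05) for v in f0]      # near-duplicate of f0 (redundant)
f2 = [rng.gauss(0, 1) for _ in range(n)]       # informative, independent of f0
f3 = [rng.gauss(0, 1) for _ in range(n)]       # noise, unrelated to the target
y = [a + b + rng.gauss(0, 0.1) for a, b in zip(f0, f2)]

chosen = mrmr_select([f0, f1, f2, f3], y, k=2)
print(chosen)
```

In this toy setup the redundancy penalty is what keeps `f0` and its near-duplicate `f1` from both being selected, even though each is highly relevant on its own.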

Protein Free Energy Prediction based on Sequence and Structure Features
LU Bangli,CHEN Qingfeng,JIANG Jiawen and LUO Haiqiong. Protein Free Energy Prediction based on Sequence and Structure Features[J]. Guangxi Sciences, 2017, 24(3): 286-291. DOI: 10.13656/j.cnki.gxkx.20170601.002
Authors: LU Bangli  CHEN Qingfeng  JIANG Jiawen  LUO Haiqiong
Affiliation: School of Computer, Electronics and Information, Guangxi University, Nanning, Guangxi, 530004, China, School of Computer, Electronics and Information, Guangxi University, Nanning, Guangxi, 530004, China; State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Guangxi University, Nanning, Guangxi, 530004, China, School of Computer, Electronics and Information, Guangxi University, Nanning, Guangxi, 530004, China and School of Information and Management, Guangxi Medical University, Nanning, Guangxi, 530021, China
Abstract: [Objective] Protein free energy accurately reflects protein interactions and can be a great help to drug design and disease treatment. It is therefore necessary to establish an accurate regression model of protein free energy. [Methods] 135 pairs of protein complexes were collected and 600 features were calculated. The minimum redundancy maximum relevance (mRMR) algorithm was used to select features significantly related to protein free energy and to remove redundant features, yielding a minimum-redundancy, maximum-relevance feature set. Six regression models were built on the selected features. The importance of the features was further analyzed by removing features and comparing the change in performance. The best model was chosen by comparing 10-fold cross-validation results and was used to predict protein free energy. [Results] On the 135 pairs of protein complexes, the model achieved a higher correlation coefficient and a lower mean absolute error than other methods. [Conclusion] The experimental results show that our method has better prediction accuracy than other methods.
Keywords: protein interaction  free energy  feature selection  regression model
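The model-comparison step in the abstract (10-fold cross-validation scored by correlation coefficient and mean absolute error) can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the abstract does not name its six regression models, so two hypothetical placeholders are used here (a mean baseline and a one-feature least-squares line), and the data are synthetic.

```python
import math
import random


def fit_mean(xs, ys):
    """Placeholder model 1: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Placeholder model 2: one-feature ordinary least squares, y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def kfold_predict(fit, xs, ys, k=10, seed=0):
    """Out-of-fold predictions: each sample is predicted by a model
    trained on the other k-1 folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [None] * len(xs)
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            preds[i] = model(xs[i])
    return preds


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson(x, y):
    """Pearson correlation (0.0 if either sequence is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx and syy else 0.0


# Synthetic data (hypothetical): 135 samples, one feature, a noisy linear target.
rng = random.Random(1)
xs = [rng.gauss(0, 1) for _ in range(135)]
ys = [-10.0 - 1.5 * x + rng.gauss(0, 0.5) for x in xs]

results = {}
for name, fit in {"mean": fit_mean, "line": fit_line}.items():
    preds = kfold_predict(fit, xs, ys, k=10)
    results[name] = {"r": pearson(ys, preds), "mae": mean_absolute_error(ys, preds)}

# Pick the model with the lowest cross-validated mean absolute error.
best = min(results, key=lambda name: results[name]["mae"])
print(best, results[best])
```

Scoring on out-of-fold predictions, rather than on the training data, is what makes the comparison between candidate models a fair estimate of predictive accuracy.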
This article is indexed in CNKI, Wanfang Data, and other databases.