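A protein complex identification algorithm integrating the PPI network and gene expression
Cite this article: LI Min, WU Xue-hong, FEI Yao-ping. A protein complex identification algorithm integrating the PPI network and gene expression[J]. Systems Engineering - Theory & Practice, 2014, 34(2): 437-443. DOI: 10.12011/1000-6788(2014)2-437
Authors: LI Min  WU Xue-hong  FEI Yao-ping
Affiliation: School of Information Science and Engineering, Central South University, Changsha 410083
Abstract: Identifying protein complexes in large-scale interaction networks plays an important role in explaining specific biological processes and predicting protein function, and it is one of the most important research topics of the post-genomic era. Because traditional identification algorithms that rely solely on the protein-protein interaction network (PPI network) are not very reliable, this paper proposes IPCIPG, a new protein complex identification algorithm that integrates the PPI network with gene expression data. Unlike earlier approaches, which use gene expression data to assess the reliability of the PPI network, this paper combines the PPI network and gene expression data during the identification process itself. IPCIPG first computes a weight for every node in the PPI network from the edge clustering coefficient (ECC) and the co-expression correlation between proteins (PCC), takes the node with the largest weight as the seed, and then expands from the seed to generate a dense subgraph. Experimental results on yeast datasets show that IPCIPG identifies protein complexes with specific biological significance more accurately and more effectively than the other algorithms HUNTER, HC-PIN, CMC, SPICI, MCODE, and MCL.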
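The node-weighting step can be illustrated with a small Python sketch. The abstract does not give the formulas, so everything specific here is an assumption: PCC is taken to be the Pearson correlation coefficient of two expression profiles, ECC is taken to be the number of triangles through an edge divided by min(deg(u) - 1, deg(v) - 1), and the way the two are combined into a node weight (a sum over incident edges of ECC times a rescaled PCC) is purely hypothetical. The function names and the toy data are illustrative and do not come from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def edge_clustering_coefficient(adj, u, v):
    """Assumed ECC: triangles through (u, v) / min(deg(u) - 1, deg(v) - 1)."""
    triangles = len(adj[u] & adj[v])
    max_possible = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return triangles / max_possible if max_possible > 0 else 0.0


def node_weights(adj, expr):
    """Hypothetical node weight: sum over incident edges of ECC * (PCC + 1) / 2."""
    weights = {}
    for u in adj:
        total = 0.0
        for v in adj[u]:
            ecc = edge_clustering_coefficient(adj, u, v)
            pcc = pearson(expr[u], expr[v])
            total += ecc * (pcc + 1) / 2  # rescale PCC from [-1, 1] to [0, 1]
        weights[u] = total
    return weights


# Toy PPI network: a triangle A-B-C with a pendant protein D attached to C.
toy_adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
# Toy expression profiles (four time points per gene).
toy_expr = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, 3, 2, 4], "D": [4, 3, 2, 1]}

toy_weights = node_weights(toy_adj, toy_expr)
seed = max(toy_weights, key=toy_weights.get)  # highest-weight node becomes the seed
```

In this toy network the pendant protein D gets weight zero, since its only edge lies in no triangle, so it would never be chosen as a seed ahead of the triangle members.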

Keywords: systems biology  protein-protein interaction network  protein complexes  gene expression data
Received: 2011-07-11

An algorithm for identifying protein complexes based on the integration of PPI network and Gene expression
LI Min, WU Xue-hong, FEI Yao-ping. An algorithm for identifying protein complexes based on the integration of PPI network and Gene expression[J]. Systems Engineering - Theory & Practice, 2014, 34(2): 437-443. DOI: 10.12011/1000-6788(2014)2-437
Authors: LI Min  WU Xue-hong  FEI Yao-ping
Affiliation: School of Information Science and Engineering, Central South University, Changsha 410083, China
Abstract: Identifying protein complexes in large-scale protein interaction networks is crucial for understanding the principles of cellular organization and for predicting protein function, and it is one of the most important problems of the post-genomic era. Traditional protein complex discovery algorithms rely only on the protein-protein interaction network (PPI network) and are therefore not very accurate. This paper proposes IPCIPG, a novel algorithm that integrates the PPI network with gene expression data. Unlike previous methods, which use gene expression data to evaluate the reliability of PPIs, IPCIPG incorporates gene expression data into the PPI network during the identification of protein complexes. IPCIPG uses the edge clustering coefficient (ECC) and the co-expression correlation between proteins (PCC) to compute the weight of each node in the PPI network. The node with the highest weight is selected as the seed, and a dense subgraph is obtained by extending from that seed. Experimental results on Saccharomyces cerevisiae data show that IPCIPG identifies protein complexes with specific biological meaning more effectively, precisely, and comprehensively than the other algorithms HUNTER, HC-PIN, CMC, SPICI, MCODE, and MCL.
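The seed-and-extend step can be sketched in Python as follows. The abstract only says that a dense subgraph is grown from the highest-weight seed, so the details are assumptions: density is taken as 2|E| / (|V|(|V| - 1)), a neighbor is added greedily while the density stays at or above a hypothetical threshold `min_density`, and clusters with fewer than three proteins are discarded. Node weights are passed in as a plain dictionary, and all function and parameter names are illustrative rather than taken from the paper.

```python
def density(adj, nodes):
    """Fraction of possible edges present among `nodes`: 2|E| / (|V| (|V| - 1))."""
    n = len(nodes)
    if n < 2:
        return 0.0
    edges = sum(len(adj[u] & nodes) for u in nodes) // 2  # each edge counted twice
    return 2.0 * edges / (n * (n - 1))


def grow_from_seed(adj, seed, min_density=0.7):
    """Greedily add the neighbor that keeps the cluster densest, until none qualifies."""
    cluster = {seed}
    while True:
        frontier = set().union(*(adj[u] for u in cluster)) - cluster
        # Score every candidate by the density of the cluster after adding it.
        scored = [(density(adj, cluster | {c}), c) for c in sorted(frontier)]
        scored = [(d, c) for d, c in scored if d >= min_density]
        if not scored:
            return cluster
        _, best = max(scored, key=lambda pair: pair[0])
        cluster.add(best)


def identify_complexes(adj, weights, min_density=0.7, min_size=3):
    """Visit seeds in descending weight order; skip proteins already clustered."""
    complexes, used = [], set()
    for seed in sorted(weights, key=weights.get, reverse=True):
        if seed in used:
            continue
        cluster = grow_from_seed(adj, seed, min_density)
        if len(cluster) >= min_size:
            complexes.append(cluster)
        used |= cluster
    return complexes


# Toy PPI network: a triangle A-B-C with a pendant protein D attached to C.
toy_adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
# Hypothetical precomputed node weights.
toy_weights = {"A": 1.9, "B": 1.9, "C": 1.8, "D": 0.0}
toy_complexes = identify_complexes(toy_adj, toy_weights)
```

Only seeds are required to be unused in this sketch, so grown clusters may overlap; whether IPCIPG allows overlapping complexes is not stated in the abstract.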
Keywords: systems biology  protein-protein interaction network  protein complexes  gene expression data
This article is indexed in CNKI and other databases.