
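Identification of Exons, Introns, and Intergenic Sequences in Model Organisms
Cite this article: Chen Cuixia, Li Qianzhong, Lin Hao. Identification of exons, introns, and intergenic sequences in model organisms [J]. Journal of Inner Mongolia University (Natural Science Edition), 2005, 36(2): 166-172.
Authors: Chen Cuixia  Li Qianzhong  Lin Hao
Affiliation: Department of Physics, College of Sciences and Technology, Inner Mongolia University, Hohhot 010021, China (all three authors)
Funding: National Natural Science Foundation of China (30160025)
Abstract: Based on the conservation of nucleotide sequences at splice sites, differences in composition, and the 3-periodicity of the reading frame in coding sequences, the whole-genome sequences of model organisms are divided into three classes: exons, introns, and intergenic sequences. Each of the three standard diversity sources is built from the probabilities of the 64 triplets over the whole sequence together with the probabilities of the 4 bases at 30 positions at the head and tail of the sequence (near the splice sites). The class of a sequence is determined by the increment of diversity between that sequence's diversity measure and the three standard diversity measures on the corresponding region. The results show that prediction with the diversity measure of 184 signal parameters (64 triplets plus 4 bases at 30 positions) is 5% higher than prediction with the 64 triplet parameters alone. Overall prediction success rates are 87.37% for C. elegans, 91.08% for Arabidopsis thaliana, 92.28% for Drosophila, 92.88% for the two-class prediction in the prokaryote E. coli, and 94.88% for yeast.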
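The classification rule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the commonly used diversity measure D = N ln N - sum(n_i ln n_i), overlapping triplet counts, 15 positions taken from each end of the sequence (30 in total), a single 184-component source per class rather than separate regions, and assignment to the class with the smallest increment of diversity. All function names and the toy sequences are hypothetical.

```python
import math
from itertools import product

BASES = "ACGT"
TRIMERS = ["".join(p) for p in product(BASES, repeat=3)]  # the 64 triplets
FLANK = 15  # assumption: 15 positions at each end, 30 positions in total


def features(seq, flank=FLANK):
    """Return 64 triplet counts followed by 4 x 30 positional base indicators."""
    seq = seq.upper()
    if len(seq) < 2 * flank:
        raise ValueError("sequence is shorter than the two flanking windows")
    counts = dict.fromkeys(TRIMERS, 0)
    for i in range(len(seq) - 2):  # assumption: overlapping (sliding) triplets
        tri = seq[i:i + 3]
        if tri in counts:  # skip triplets containing ambiguous symbols
            counts[tri] += 1
    ends = seq[:flank] + seq[-flank:]
    positional = [1 if ends[pos] == base else 0
                  for pos in range(2 * flank) for base in BASES]
    return [counts[t] for t in TRIMERS] + positional


def diversity(counts):
    """Measure of diversity D = N ln N - sum(n_i ln n_i) of a count vector."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, s):
    """ID(X, S) = D(X + S) - D(X) - D(S); small when X and S are similar."""
    merged = [a + b for a, b in zip(x, s)]
    return diversity(merged) - diversity(x) - diversity(s)


def build_standard(train_seqs):
    """Sum the feature vectors of all training sequences of one class."""
    vectors = [features(s) for s in train_seqs]
    return [sum(col) for col in zip(*vectors)]


def classify(seq, standards):
    """Return the class label whose standard source gives the smallest ID."""
    x = features(seq)
    return min(standards, key=lambda label: increment_of_diversity(x, standards[label]))


# Toy data only; real standards would be built from annotated genome sequences.
standards = {
    "class_a": build_standard(["ACG" * 20]),
    "class_b": build_standard(["TTA" * 20]),
}
label = classify("ACG" * 15, standards)
```

Because the increment of diversity is zero when two sources have identical relative frequencies and grows as they diverge, taking the minimum over the three standard sources gives a simple nearest-source decision.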

Keywords: exon  intron  intergenic sequence  splice site  increment of diversity

The Identification of Exon Intron and Intergenic DNA in the Model Species Genomes
CHEN Cui-xia, LI Qian-zhong, LIN Hao. The Identification of Exon Intron and Intergenic DNA in the Model Species Genomes [J]. Acta Scientiarum Naturalium Universitatis Neimongol, 2005, 36(2): 166-172.
Authors:CHEN Cui-xia  LI Qian-zhong  LIN Hao
Abstract: Based on the conservation of nucleotides around splice sites, compositional features, and the 3-periodicity of reading frames in coding sequences, the complete sequences of five model species genomes are grouped into three classes: intron, exon, and intergenic DNA. The three standard sources of diversity are each determined by the probabilities (bp/kb) of the 64 trimers and of the 4 bases at 30 positions around the splice sites. The class of a sequence is determined by the increment of diversity. Prediction with all 184 information signals is better than prediction with the 64 signals alone. The prediction accuracies with 184 signals are about 87.37%, 91.08%, 92.28%, 92.88%, and 94.88% for the C. elegans (C), A. thaliana (A), D. melanogaster (D), E. coli (E), and S. cerevisiae (S) genomes, respectively.
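The abstract does not state the formula for the increment of diversity. For orientation, the definition commonly used in the diversity-measure literature is given here as an assumption, not as a quotation from the paper; the symbols $n_i$, $m_i$, $N$, and $M$ are introduced only for illustration.

```latex
% Diversity of a source X with s components and counts n_1, ..., n_s
D(X) = N \ln N - \sum_{i=1}^{s} n_i \ln n_i, \qquad N = \sum_{i=1}^{s} n_i

% Increment of diversity between X and a standard source S with counts m_1, ..., m_s
ID(X, S) = D(X + S) - D(X) - D(S), \qquad
D(X + S) = (N + M) \ln (N + M) - \sum_{i=1}^{s} (n_i + m_i) \ln (n_i + m_i), \qquad
M = \sum_{i=1}^{s} m_i

% Number of components when trimer and positional signals are combined
s = 64 + 4 \times 30 = 184
```

Under this definition $ID(X, S) \ge 0$, with equality when the two sources have the same relative frequencies, so a smaller value indicates a closer match to the standard source.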
Keywords:exon  intron  intergenic DNA  splice site  the increment of diversity
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.