
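Recognition of DNA Sequences Based on Hidden Markov Models (基于隐马尔可夫模型的DNA序列识别)
Cite this article: Luo Ze-ju, Li Yan-hui, Song Li-hong, Zhu Si-ming. Recognition of DNA Sequences Based on Hidden Markov Models [J]. Journal of South China University of Technology (Natural Science Edition), 2007, 35(8): 123-126
Authors: Luo Ze-ju, Li Yan-hui, Song Li-hong, Zhu Si-ming
Affiliations: School of Computer Science and Information Engineering, Chongqing Technology and Business University, Chongqing 400067; School of Mathematics and Computational Science, Sun Yat-sen University, Guangzhou 510275, Guangdong; Center of Experiment and Practice, Chongqing Technology and Business University, Chongqing 400067
Funding: National Natural Science Foundation of China; project funded by the Chongqing Municipal Education Commission
Abstract: DNA sequences with different structures have different distribution ranges of the L value in hidden Markov model training. Using this property, the traditional multiclass voting model is improved and a fast training algorithm superior to the traditional algorithm is proposed; the algorithm needs to train the hidden Markov model parameters of only one class. In recognizing DNA intron and exon sequences, the average recognition rate reaches 90.8%. Compared with the support vector machine, the hidden Markov model has advantages in multiclass classification problems: it takes less computation time and achieves a higher recognition rate.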

Keywords: hidden Markov model; DNA sequence; intron; exon; recognition; fast training algorithm
Article ID: 1000-565X(2007)08-0123-04
Revision date: 2006-08-26

Recognition of DNA Sequences Based on Hidden Markov Models
Luo Ze-ju, Li Yan-hui, Song Li-hong, Zhu Si-ming. Recognition of DNA Sequences Based on Hidden Markov Models [J]. Journal of South China University of Technology (Natural Science Edition), 2007, 35(8): 123-126
Authors: Luo Ze-ju, Li Yan-hui, Song Li-hong, Zhu Si-ming
Affiliation: 1. School of Computer Science and Information Engineering, Chongqing Tech. and Business Univ., Chongqing 400067, China; 2. School of Mathematics and Computational Science, Sun Yat-Sen Univ., Guangzhou 510275, Guangdong, China; 3. Center of Experiment and Practice, Chongqing Tech. and Business Univ., Chongqing 400067, China
Abstract: In hidden Markov model (HMM) training, DNA sequences with different structures yield different distribution ranges of the L value. Exploiting this property, the traditional multiclass voting model is improved and a fast training algorithm that outperforms the traditional one is proposed. The algorithm needs to train the HMM parameters of only one class. Applied to the recognition of intron and exon sequences in DNA, it reaches an average recognition rate of 90.8%. Compared with the support vector machine, the HMM is better suited to multiclass classification, requiring less computation time and achieving a higher recognition rate.
Keywords: hidden Markov model; DNA sequence; intron; exon; recognition; fast training algorithm
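The abstract does not define the L value or give the trained parameters, so the following is only a minimal sketch of the general idea, assuming L is the log-likelihood of a sequence under a single trained HMM. The two-state model `MODEL`, the function names, and the range `[lo, hi]` are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math

SYMBOLS = "ACGT"

# Hypothetical 2-state model (illustrative numbers, not from the paper).
MODEL = (
    [0.6, 0.4],                                     # initial state probabilities
    [[0.7, 0.3], [0.4, 0.6]],                       # state transition matrix
    [[0.1, 0.4, 0.4, 0.1], [0.3, 0.2, 0.2, 0.3]],   # emission probs over A, C, G, T
)

def log_likelihood(seq, start, trans, emit):
    """Scaled forward algorithm: returns log P(seq | model)."""
    n_states = len(start)
    o = SYMBOLS.index(seq[0])
    # Initialise with the first symbol, then rescale to avoid underflow.
    alpha = [start[i] * emit[i][o] for i in range(n_states)]
    scale = sum(alpha)
    alpha = [a / scale for a in alpha]
    log_l = math.log(scale)
    for ch in seq[1:]:
        o = SYMBOLS.index(ch)
        # Propagate through the transitions, then weight by the emission.
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][o]
            for j in range(n_states)
        ]
        scale = sum(alpha)
        alpha = [a / scale for a in alpha]
        log_l += math.log(scale)
    return log_l

def in_trained_class(seq, model, lo, hi):
    """True if the per-base log-likelihood lies in the assumed range [lo, hi]."""
    score = log_likelihood(seq, *model) / len(seq)
    return lo <= score <= hi
```

With only one model to train, a sequence is assigned to the trained class when its score falls inside the range observed for that class and to the other class otherwise; in practice the range would have to be estimated from labelled training sequences.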
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.