Similar literature
 19 similar articles found (search time: 640 ms)
1.
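In this study, we carried out a genome survey of Coptis chinensis (huanglian) using next-generation sequencing. Two libraries with insert sizes of 200 bp and 500 bp were constructed and sequenced to a depth of about 30X. Sequencing yielded 54 Gb of raw data, of which 44.8 Gb remained after filtering. Preliminary assembly with SOAPdenovo produced contig and scaffold sequences, 130,381 scaffolds in total. Further analysis indicated a genome size of about 1,116 Mb and a heterozygosity of roughly 1.1%, suggesting that completing the whole-genome sequence of this species may require combining the shotgun approach with other methods such as BAC library sequencing.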

2.
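Whole-genome sequencing technologies and their application in woody plants   Total citations: 2, self-citations: 0, citations by others: 2
Genome sequences are an essential information base for genetic research. With sequencing technology having advanced rapidly to third-generation long-read methods, read lengths have grown from tens of bases to tens of thousands of bases, which greatly benefits the completeness and accuracy of genome assemblies. Whole-genome sequencing has now been completed for a large number of plant species, including more than 40 woody plants, and many more tree genomes are under way. Researchers have also developed numerous bioinformatics tools for assembling genomes from each type of sequencing data and for downstream analysis. Covering sequencing technology, genome assembly, and bioinformatic analysis of whole-genome sequences, the authors list the woody plants whose genomes have been sequenced to date, review the development and application of whole-genome sequencing, and introduce analysis software suited to assembling third-generation data, as a reference for researchers in forest tree genomics.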

3.
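Manglietia pachyphylla is a woody plant of the genus Manglietia in the Magnoliaceae. It is sparsely distributed in Guangdong Province and the Guangxi Zhuang Autonomous Region of China and is a national second-class key protected wild plant. Understanding the genomic information and genetic diversity of endangered species helps to conserve and use them sensibly and to bring them out of endangerment and restore their populations. To this end, we sequenced the M. pachyphylla genome by high-throughput sequencing and used the data to assemble a draft genome; we then predicted repeats and genes in the assembly and carried out phylogenetic and gene family analyses. The assembled genome is 2,092,298,891 bp in size and comprises 676 assembled sequences, with an N50 of 7,961,115 bp (N50: sort the assembled sequences from longest to shortest and accumulate their lengths; the length of the sequence at which the running total reaches 50% of the genome length is the N50). Assembly completeness was assessed with BUSCO (Benchmarking Universal Single-Copy Orthologs) against the "eudicots" and "embryophyta" single-copy gene sets, giving 96.6% and 98.8%, respectively. Repeats make up 76.5% of the genome, which contains 37,900 genes encoding 41,675 protein sequences. Phylogenetic analysis grouped M. pachyphylla with Magnolia biondii, with a divergence time of roughly 10,500,000 years ago. Gene families related to xylem/phloem, actin filaments, heat, photosynthesis, and several kinds of secondary metabolism are significantly expanded in M. pachyphylla; the secondary-metabolism genes occur as tandem and proximal duplicates in the genome, and the way these genes expanded and duplicated may be related to the adaptation of M. pachyphylla to high-altitude environments. This is the first genome reported worldwide for the genus Manglietia of the Magnoliaceae, and it provides genetic information and a reference for better conserving and developing the germplasm of M. pachyphylla and other Magnoliaceae species.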
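
The N50 definition given in this abstract can be illustrated with a minimal Python sketch. The function name `n50` and the toy contig lengths are assumptions made for illustration only; they are not taken from the study, and the 50% threshold is computed against the total assembled length.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and their lengths are
    accumulated; the length of the sequence at which the running total
    first reaches half of the total assembled length is the N50.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (total length 300): 80 + 70 = 150 reaches 50%.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # prints 70
```

Applied to the real assembly, the input would be the lengths of the 676 assembled sequences; those lengths are not listed in the abstract, so the reported N50 cannot be recomputed here.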

4.
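Analysis of the distribution of ERIC sequences in different bacterial genomes   Total citations: 16, self-citations: 2, citations by others: 14
Repetitive DNA sequences are an important component of bacterial genomes, and ERIC (IRU) sequences are a class of intergenic repeats discovered in the genomes of enteric bacteria. In this study we used the HMMER software to build a model of the ERIC sequence family, searched the whole-genome sequences of 46 sequenced strains with that model, and compared the results to determine how ERIC (IRU) sequences are distributed across bacterial genomes: ERIC sequences occur mainly in enteric bacteria and are not universally present in bacterial genomes.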

5.
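Kluyveromyces marxianus is a promising new host for cell factories. The genome assembly and annotation of K. marxianus FIM1 (CGMCC No.10621) were completed in 2019 and released at NCBI (GCA_001854445.2), but they proved incomplete when the genome information was put to use. In this study we therefore verified the K. marxianus FIM1 genome sequence against resequencing DNA-seq data, supplemented its annotation, and refined the genome information further by comparative genomics. Based on the resequencing alignments, 61 sites that the sequencing data could not cover, totalling 4,910 bp, were deleted. The new K. marxianus FIM1 sequence is 10,909,543 bp long and covers 99.5% of the genes in the conserved gene set for Saccharomycetales. Annotation of non-coding RNAs and secondary-metabolite gene clusters was also added. Furthermore, we used genome synteny analysis to identify regions in which gene order has been conserved during species divergence, and compared K. marxianus FIM1 with 11 K. marxianus strains published at NCBI...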

6.
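To obtain an agar-degrading bacterium that produces agaro-oligosaccharides in high yield, the whole genome of strain FG12 was sequenced on the Illumina Hiseq2000 platform. Reads were assembled with SOAPdenovo 2.04, open reading frames were predicted with Glimmer 3.02, rRNAs with RNAmmer 1.2, and tRNAs with tRNAscan-SE 1.23, and gene functions were predicted using COG, CO, and KEGG. The FG12 genome is 4.11 Mb in size with a GC content of 37.76%; it contains 4,441 open reading frames with an average length of 780 bp, along with 76 tRNAs and 5 rRNAs. It carries genes related to agaro-oligosaccharide metabolism as well as antibiotic metabolic pathways. The agar-degrading strain FG12 therefore has the potential to be engineered into an efficient production strain.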
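
GC content, as reported for FG12, is the percentage of G and C bases among the bases of a sequence. The following is a minimal sketch, assuming that ambiguous bases such as N are simply excluded from the count; the function name `gc_content` and the example sequence are hypothetical and unrelated to the FG12 data.

```python
def gc_content(seq):
    """Return the GC content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted; excluding N and
    other ambiguity codes is an assumption of this sketch.
    """
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Hypothetical 6-base example: 4 of 6 bases are G or C.
print(round(gc_content("ATGCGC"), 2))  # prints 66.67
```

As a rough cross-check of the reported figures, a 4.11 Mb genome at 37.76% GC would hold about 1.55 million G and C bases.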

7.
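In this paper, the whole genome of Geobacillus sp. YHL was sequenced with Illumina high-throughput sequencing and assembled with Velvet. Genes were predicted with Glimmer, and the resulting proteins were aligned against databases such as COG and KEGG to obtain annotations. The annotations were summarized and analysed with several plotting tools, and mining them showed that the strain carries genes encoding a variety of enzymes, including glycoside hydrolases, glucosidases, xylanases, amylases, neopullulanase, pullulanase, and lipases. It is a thermophilic bacterium encoding multiple enzymes and has some potential for application. Particular attention was paid to the genes encoding heat-stress proteins; this gene information can ultimately offer a preliminary explanation of the bacterium's mechanism of heat adaptation.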

8.
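To characterize the complete chloroplast genome of the medicinal plant Clematis ranunculoides and determine its phylogenetic position, leaf samples were sequenced by high-throughput sequencing, and the chloroplast genome was assembled, annotated, and characterized. Molecular phylogenetic trees were built by maximum parsimony (MP), maximum likelihood (ML), and Bayesian inference (BI). The chloroplast genome has the typical quadripartite structure, is 159,741 bp long, and encodes 112 genes: 79 protein-coding genes, 29 tRNA genes, and 4 rRNA genes. It contains 84 simple sequence repeats, most of them (57) mononucleotide repeat motifs. The most frequently used amino acid is leucine. The MP, ML, and BI trees built from complete chloroplast genomes have largely consistent topologies: Clematis is clearly monophyletic and most closely related to Anemone, and C. ranunculoides is most closely related to Clematis montana.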
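
Mononucleotide simple sequence repeats of the kind counted in this abstract are runs of a single base repeated many times. The sketch below shows one way to locate such runs with a regular expression. The minimum of 10 repeats is an assumed threshold, since the abstract does not state the criteria used in the study, and the function name `find_mono_ssrs` and the test sequence are hypothetical.

```python
import re


def find_mono_ssrs(seq, min_repeats=10):
    """Find mononucleotide repeats of at least min_repeats bases.

    Returns a list of (start, base, length) tuples with 0-based start
    positions. The default threshold of 10 is an assumption, not the
    setting used in the study.
    """
    # One base followed by at least (min_repeats - 1) copies of itself.
    pattern = re.compile(rf"([ACGT])\1{{{min_repeats - 1},}}")
    return [
        (match.start(), match.group(1), len(match.group(0)))
        for match in pattern.finditer(seq.upper())
    ]


# Hypothetical sequence: a 12-base A run (reported) and a 9-base T run (too short).
example = "GG" + "A" * 12 + "C" + "T" * 9 + "G"
print(find_mono_ssrs(example))  # prints [(2, 'A', 12)]
```

Dedicated SSR tools also detect di- to hexanucleotide motifs, each with its own threshold; this sketch covers only the mononucleotide case.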

9.
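This experiment aimed to assemble and analyse the sheep muscle transcriptome, in order to enrich sheep genomic resources and lay a foundation for further molecular genetic studies of sheep genes. Skeletal muscle transcriptome libraries from Dorper and Small-tailed Han sheep were sequenced on the Illumina Hiseq 2000 platform, and the data were assembled de novo. The two libraries yielded 103,058,824 high-quality reads of 2×90 bp, and de novo assembly produced 145,524 unigenes. Between the two libraries, 5,718 unigenes were differentially expressed; GO annotation showed that they fall mainly under the cell, cellular process, and binding terms. Further assembly of the unigenes from both libraries gave 70,348 all-unigenes with an average length of 863 bp, of which 35,201 were annotated in the Nr database and 12,219 in the COG database; they were closely associated with the protein and amino acid metabolism pathways among 258 functional pathways in the KEGG database.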

10.
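A multidrug-resistant Comamonas kerstersii (C. kerstersii) strain, 121606, was isolated from a patient with a urinary tract infection and subjected to antimicrobial susceptibility testing (AST) and whole-genome sequencing. Its genome was then compared with those of 7 representative Comamonas strains and Acidovorax species, including average nucleotide identity (ANI) analysis with OrthoANI and single-nucleotide polymorphism (SNP) analysis on the snpTree web server. Finally, the genome sequence was processed with the RAST server, orthologous clusters were functionally annotated with OrthoVenn, antibiotic resistance genes (ARGs) were predicted with the CARD database, CRISPRs were predicted with a CRISPR recognition tool, and prophages were predicted with PHAST. The results show that C. kerstersii 121606 is a multidrug-resistant bacterium whose genetic composition is similar to that of the other 7 Comamonas and Acidovorax species, and that the ARGs present in its genome help to explain its multidrug resistance mechanism. These findings offer valuable insight for research into new antibiotics to control multidrug-resistant C. kerstersii infections.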

11.
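To characterize the complete chloroplast genome of Chara globularis and explore its phylogenetic relationships within the Characeae, this study assembled and analysed its complete chloroplast genome from high-throughput sequencing data. The chloroplast genome is 180,652 bp long with a GC content of 26.6% and has the typical quadripartite circular structure, closely resembling that of Chara vulgaris. A total of 137 genes were annotated, including 94 protein-coding genes, 37 tRNA genes, and 6 rRNA genes, which is 2 protein-coding genes and 3 tRNA genes more than in the Nitella species 无色丽藻; compared with higher plants, it has four distinctive genes: rpl12, trnL(gag), rpl19, and ycf20. In all, 87 SSR loci were detected, the great majority composed of A and T. The genome contains 24,989 codons, codon usage favours A and T, and leucine (Leu) is the most frequently encoded amino acid. A phylogenetic tree built by the neighbour-joining (NJ) method from the complete chloroplast genomes of 5 species, including C. globularis, shows that C. globularis is most closely related to C. vulgaris. This study resolves the complete chloroplast genome of C. globularis and uses the available data to establish its phylogenetic position.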

12.
Since the sequencing of the first two chromosomes of the malaria parasite, Plasmodium falciparum, there has been a concerted effort to sequence and assemble the entire genome of this organism. Here we report the sequence of chromosomes 1, 3-9 and 13 of P. falciparum clone 3D7; these chromosomes account for approximately 55% of the total genome. We describe the methods used to map, sequence and annotate these chromosomes. By comparing our assemblies with the optical map, we indicate the completeness of the resulting sequence. During annotation, we assign Gene Ontology terms to the predicted gene products, and observe clustering of some malaria-specific terms to specific chromosomes. We identify a highly conserved sequence element found in the intergenic region of internal var genes that is not associated with their telomeric counterparts.

13.
Genome sequence and analysis of the tuber crop potato   Total citations: 11, self-citations: 0, citations by others: 11
Potato (Solanum tuberosum L.) is the world's most important non-grain food crop and is central to global food security. It is clonally propagated, highly heterozygous, autotetraploid, and suffers acute inbreeding depression. Here we use a homozygous doubled-monoploid potato clone to sequence and assemble 86% of the 844-megabase genome. We predict 39,031 protein-coding genes and present evidence for at least two genome duplication events indicative of a palaeopolyploid origin. As the first genome sequence of an asterid, the potato genome reveals 2,642 genes specific to this large angiosperm clade. We also sequenced a heterozygous diploid clone and show that gene presence/absence variants and other potentially deleterious mutations occur frequently and are a likely cause of inbreeding depression. Gene family expansion, tissue-specific expression and recruitment of genes to new pathways contributed to the evolution of tuber development. The potato genome sequence provides a platform for genetic improvement of this vital crop.

14.
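Fragments of the alcohol dehydrogenase gene (Adh1) were PCR-amplified, cloned, and sequenced from genomes C and D of the allotetraploids in the genus Oryza and from all extant diploid genomes of the genus (A, B, C, E, and F). With the G genome sequence as outgroup, the sequences were analysed phylogenetically using the parsimony method in PAUP. The results show that (1) the three CCDD tetraploids are products of a single hybridization event; (2) the C genome of the tetraploids is phylogenetically closer to the C genome of the Asian diploids; and (3) the D and E genomes are closely related and may share a common ancestor.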

15.
Wang J  Wang W  Li R  Li Y  Tian G  Goodman L  Fan W  Zhang J  Li J  Zhang J  Guo Y  Feng B  Li H  Lu Y  Fang X  Liang H  Du Z  Li D  Zhao Y  Hu Y  Yang Z  Zheng H  Hellmann I  Inouye M  Pool J  Yi X  Zhao J  Duan J  Zhou Y  Qin J  Ma L  Li G  Yang Z  Zhang G  Yang B  Yu C  Liang F  Li W  Li S  Li D  Ni P  Ruan J  Li Q  Zhu H  Liu D  Lu Z  Li N  Guo G  Zhang J  Ye J  Fang L  Hao Q  Chen Q  Liang Y  Su Y  San A  Ping C  Yang S  Chen F  Li L  Zhou K  Zheng H  Ren Y  Yang L  Gao Y  Yang G  Li Z  Feng X  Kristiansen K  Wong GK  Nielsen R  Durbin R  Bolund L  Zhang X 《Nature》2008,456(7218):60-65
Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics.

16.
Anaerobic ammonium oxidation (anammox) has become a main focus in oceanography and wastewater treatment. It is also the nitrogen cycle's major remaining biochemical enigma. Among its features, the occurrence of hydrazine as a free intermediate of catabolism, the biosynthesis of ladderane lipids and the role of cytoplasm differentiation are unique in biology. Here we use environmental genomics (the reconstruction of genomic data directly from the environment) to assemble the genome of the uncultured anammox bacterium Kuenenia stuttgartiensis from a complex bioreactor community. The genome data illuminate the evolutionary history of the Planctomycetes and allow us to expose the genetic blueprint of the organism's special properties. Most significantly, we identified candidate genes responsible for ladderane biosynthesis and biological hydrazine metabolism, and discovered unexpected metabolic versatility.

17.
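Genomic DNA extracted from pig liver was used as template to amplify intron 3 of Klf4, intron 2 of Klf5, and intron 1 of Egr2, which are 916, 1027, and 1342 bp long, respectively. The introns were confirmed by aligning the partial exon sequences flanking them against GenBank sequences, and were compared in length and sequence homology with the corresponding human introns. The degree of conservation between human and pig inferred from the intron alignments agrees with that inferred from amino acid level alignments of these genes.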

18.
The sequence of the rice genome holds fundamental information for its biology, including physiology, genetics, development, and evolution, as well as information on many beneficial phenotypes of economic significance. Using a "whole genome shotgun" approach, we have produced a draft rice genome sequence of Oryza sativa ssp. indica, the major crop rice subspecies in China and many other regions of Asia. The draft genome sequence is constructed from over 4.3 million successful sequencing traces with an accumulative total length of 2214.9 Mb. The initial assembly of the non-redundant sequences reached 409.76 Mb in length, based on 3.30 million successful sequencing traces with a total length of 1797.4 Mb from an indica variant, cultivar 93-11, giving an estimated coverage of 95.29% of the rice genome with an average base accuracy of higher than 99%. The coverage of the draft sequence, the randomness of the sequence distribution, and the consistency of BIG-ASSEMBLER, a custom-designed software package used for the initial assembly, were verified rigorously by comparisons against finished BAC clone sequences from both indica and japanica strains, available from the public databases. Overall, 96.3% of full-length cDNAs, 96.4% of STS, STR, and RFLP markers, 94.0% of ESTs and 94.9% of unigene clusters were identified from the draft sequence. Our preliminary analysis of the data set shows that our rice draft sequence is consistent with the common standard accepted by the genome sequencing community. The unconditional release of the draft to the public also undoubtedly provides a fundamental resource to the international scientific communities to facilitate genomic and genetic studies on rice biology.

19.
The map-based sequence of the rice genome   Total citations: 14, self-citations: 0, citations by others: 14
Rice, one of the world's most important food plants, has important syntenic relationships with the other cereal species and is a model plant for the grasses. Here we present a map-based, finished quality sequence that covers 95% of the 389 Mb genome, including virtually all of the euchromatin and two complete centromeres. A total of 37,544 non-transposable-element-related protein-coding genes were identified, of which 71% had a putative homologue in Arabidopsis. In a reciprocal analysis, 90% of the Arabidopsis proteins had a putative homologue in the predicted rice proteome. Twenty-nine per cent of the 37,544 predicted genes appear in clustered gene families. The number and classes of transposable elements found in the rice genome are consistent with the expansion of syntenic regions in the maize and sorghum genomes. We find evidence for widespread and recurrent gene transfer from the organelles to the nuclear chromosomes. The map-based sequence has proven useful for the identification of genes underlying agronomic traits. The additional single-nucleotide polymorphisms and simple sequence repeats identified in our study should accelerate improvements in rice production.
