Similar literature
 18 similar articles found (search time: 734 ms)
1.
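Objective: To identify the shortcomings of existing genome assembly methods based on third-generation sequencing data and to propose improvements that raise assembly accuracy and running efficiency. Methods: Current genome assembly methods based on third-generation long-read sequencing were analyzed in depth, including FALCON, Canu, and MECAT, which follow a "correct-then-assemble" strategy, and Flye and Wtdbg2, which follow an "assemble-then-correct" strategy, and the strengths and weaknesses of each strategy were identified. Results and conclusions: Drawing on the advantages of both strategies, a new genome assembly scheme that combines them is proposed, which resolves the difficulties in current genome assembly based on third-generation sequencing data.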

2.
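Whole-genome sequencing technologies and their application in woody plants   Total citations: 2 (self-citations: 0; citations by others: 2)
Genome sequences are an essential information base for genetic research. As sequencing technology has advanced rapidly to third-generation long-read methods, read lengths have grown from tens of bases to tens of thousands of bases, which greatly benefits the completeness and accuracy of genome assembly. Whole-genome sequencing has now been completed for a large number of plant species, including more than 40 woody plants, and the genomes of more tree species are being sequenced. Researchers have also developed many bioinformatics tools for genome assembly and downstream analysis of data from the various sequencing technologies. Covering sequencing technology, genome assembly technology, and bioinformatic analysis of whole-genome sequencing, the authors list the woody plants whose whole genomes have been sequenced to date, describe the development and application of whole-genome sequencing technology, and introduce analysis software suited to assembling third-generation data, as a reference for forest tree genome researchers.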

3.
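Using existing sequencing data, three commonly used sequence assembly programs were applied to assemble the whole-genome sequencing results of Paenibacillus shenyangensis. The assemblies produced by each program under its optimal parameters were compared, and gene comparison and prediction were carried out against other closely related Paenibacillus species in the NCBI database. The results show that SOAPdenovo produced the best assembly: at a k-mer size of 23, the total assembled genome length and N50 were 5 501 467 bp and 293 864 bp, respectively, and 4 393 of the 4 800 predicted genes were successfully aligned to and annotated against the NCBI-Nr database.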

4.
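Manglietia pachyphylla (厚叶木莲) is a woody plant of the genus Manglietia in the family Magnoliaceae. It is sparsely distributed in Guangdong Province and the Guangxi Zhuang Autonomous Region of China and is a national Class II key protected wild plant. Understanding the genomic information and genetic diversity of endangered species helps to protect and use them rationally and to bring about their recovery and rejuvenation. To this end, this study sequenced the M. pachyphylla genome by high-throughput sequencing and used the data to assemble a draft genome; repeat sequences and genes were then predicted from the assembly, and phylogenetic and gene family analyses were carried out. The assembled genome is 2 092 298 891 bp in size and comprises 676 assembled sequences, with an N50 of 7 961 115 bp (N50: the assembled sequences are summed in order from longest to shortest; the length of the sequence at which the running total reaches 50% of the genome length is the N50). Assembly completeness was assessed with BUSCO (Benchmarking Universal Single-Copy Orthologs) against the "eudicots" and "embryophyta" single-copy gene sets, giving completeness of 96.6% and 98.8%, respectively. Repeat sequences make up 76.5% of the genome, which contains 37 900 genes encoding 41 675 protein sequences. Phylogenetic analysis placed M. pachyphylla together with Magnolia biondii, with the two diverging roughly 10 500 000 years ago. Gene families related to xylem/phloem, actin filaments, heat, photosynthesis, and several kinds of secondary metabolism are significantly expanded in M. pachyphylla; the secondary metabolism genes occur as tandem and proximal duplicates in the genome, and the expansion and mode of duplication of these genes may be related to adaptation to high-altitude environments. This is the first genome reported in China or abroad for the genus Manglietia of the Magnoliaceae, and it provides genetic information and a reference for better conserving and developing the germplasm resources of M. pachyphylla and other Magnoliaceae species.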
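The N50 definition given in this abstract can be sketched in a few lines of Python. This is only an illustration: the function name `n50` and the example contig lengths are hypothetical, and the sketch assumes the 50% threshold is taken against the summed length of the sequences passed in, which is the common convention when no separate genome size estimate is supplied.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumption: the 50% threshold is measured against the sum of the
    given lengths, not against an independent genome size estimate.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence to the shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # The first sequence that brings the running total to at least
        # half of the total length defines the N50.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths, not taken from the study.
example_lengths = [80, 70, 50, 40, 30, 20]
print(n50(example_lengths))
```

Comparing `2 * running` with `total` keeps the arithmetic in integers, so no rounding is involved at the threshold.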

5.
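The chloroplast genome is an important part of the plant genome, and resolving the structural differences among the chloroplast genomes of potato and its wild relatives is important for understanding potato evolution. Solanum fernandezianum, Solanum etuberosum, Solanum palustre, and Solanum phureja were selected for chloroplast genome assembly and structural analysis. It was found that their chloroplast...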

6.
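For the genome assembly problem, the existing OLC (overlap-layout-consensus) assembly technique is improved in four stages: data preprocessing; using the KMP algorithm to determine quickly, in O(m+n) time, the maximum overlap between two base fragments; joining reads into contigs according to the overlap graph; and determining the contig N50. The optimized model and algorithm are given, and they solve the genome assembly problem reasonably well.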
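The O(m+n) overlap step mentioned in this abstract can be illustrated with the KMP prefix function. The sketch below is not the authors' implementation; it assumes that "maximum overlap" means the longest suffix of one fragment that is also a prefix of the other, and the names `prefix_function` and `overlap_length`, like the separator character, are hypothetical choices.

```python
def prefix_function(s):
    """KMP prefix function: pi[i] is the length of the longest proper
    prefix of s[:i + 1] that is also a suffix of it."""
    pi = [0] * len(s)
    k = 0
    for i in range(1, len(s)):
        # Fall back through shorter borders until the next character fits.
        while k > 0 and s[i] != s[k]:
            k = pi[k - 1]
        if s[i] == s[k]:
            k += 1
        pi[i] = k
    return pi


def overlap_length(left, right, sep="#"):
    """Length of the longest suffix of `left` that is a prefix of `right`.

    Assumption: `sep` does not occur in either fragment, so a match can
    never run across the separator.
    """
    if not left or not right:
        return 0
    if sep in left or sep in right:
        raise ValueError("separator must not occur in the fragments")
    # Runs in O(len(left) + len(right)) time.
    return prefix_function(right + sep + left)[-1]


# Hypothetical fragments: the suffix "TGCA" of the first is a prefix of the second.
print(overlap_length("ACGTTGCA", "TGCAGGT"))
```

An overlap graph can then be built by calling `overlap_length` on each ordered pair of reads and keeping the pairs whose overlap exceeds a chosen minimum.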

7.
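Science & Technology Review (Beijing) (《科技导报(北京)》), 2009, 27(24): 14
China proposes the "human pan-genome" concept for the first time. "Building the sequence map of the human pan-genome", a collaborative study led by BGI-Shenzhen (深圳华大基因研究院) with South China University of Technology as a major participant, has been published. The study sets a new standard for human genome sequencing, points the way for future medical research, and reflects the leading position of Chinese genomics in the world (Nature Biotechnology, doi:10.1038/nbt.1596). Using a large-genome assembly tool for second-generation sequencing developed in-house by BGI-Shenzhen, the study deeply sequenced and assembled the YH (炎黄一号) genome, the first personal genome of an Asian individual, and found that, in addition to the previously recognized single nucleotide polymorphisms, ...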

8.
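Taking the Escherichia coli genome as the subject, and based on k-mer frequency information from in vitro assembled nucleosome sequences, core DNA and linker DNA were classified using the increment of diversity combined with quadratic discriminant analysis; the overall accuracy and correlation coefficient reached 83.08% and 0.619, respectively. Preferred k-mers in nucleosome-positioning sequences and nucleosome-depleted sequences were compared across the E. coli, yeast, and human genomes, and the results indicate that nucleosome-depleted sequences are more conserved.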
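The k-mer frequency information that this abstract uses as classifier input can be computed as in the sketch below. It covers only the feature extraction step, not the increment-of-diversity or quadratic discriminant stages; the function name `kmer_frequencies` and the example sequence are hypothetical.

```python
from collections import Counter


def kmer_frequencies(sequence, k):
    """Return the relative frequency of each k-mer in `sequence`.

    Overlapping windows of width k are counted; a sequence shorter
    than k yields an empty dictionary.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


# Hypothetical sequence, not taken from the study.
print(kmer_frequencies("AAAC", 2))
```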

9.
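Based on next-generation sequencing, this study surveyed the genome of Coptis (黄连). Two libraries with insert sizes of 200 bp and 500 bp were constructed and sequenced to a depth of about 30X, yielding 54 Gb of raw data and 44.8 Gb after filtering. Assembly with SOAPdenovo gave preliminary contig and scaffold sequences. Further analysis indicated a genome size of about 1,116 Mb with heterozygosity of roughly 1.1%, suggesting that completing whole-genome sequencing of this species may require combining the shotgun approach with other methods such as BAC library sequencing. Preliminary assembly of these data yielded 130,381 scaffold sequences.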

10.
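The Institute of Crop Sciences of the Chinese Academy of Agricultural Sciences, in collaboration with BGI-Shenzhen and others, was the first in the world to complete a draft genome of Aegilops tauschii (粗山羊草), the donor of the wheat D genome, ending the period in which wheat had no assembled genome sequence. The result was recently published online in Nature and marks the entry of China's wheat genome research into the world's front ranks.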

11.
She X  Jiang Z  Clark RA  Liu G  Cheng Z  Tuzun E  Church DM  Sutton G  Halpern AL  Eichler EE 《Nature》2004,431(7011):927-930
Complex eukaryotic genomes are now being sequenced at an accelerated pace primarily using whole-genome shotgun (WGS) sequence assembly approaches. WGS assembly was initially criticized because of its perceived inability to resolve repeat structures within genomes. Here, we quantify the effect of WGS sequence assembly on large, highly similar repeats by comparison of the segmental duplication content of two different human genome assemblies. Our analysis shows that large (> 15 kilobases) and highly identical (> 97%) duplications are not adequately resolved by WGS assembly. This leads to significant reduction in genome length and the loss of genes embedded within duplications. Comparable analyses of mouse genome assemblies confirm that strict WGS sequence assembly will oversimplify our understanding of mammalian genome structure and evolution; a hybrid strategy using a targeted clone-by-clone approach to resolve duplications is proposed.

12.
The human genome is by far the largest genome to be sequenced, and its size and complexity present many challenges for sequence assembly. The International Human Genome Sequencing Consortium constructed a map of the whole genome to enable the selection of clones for sequencing and for the accurate assembly of the genome sequence. Here we report the construction of the whole-genome bacterial artificial chromosome (BAC) map and its integration with previous landmark maps and information from mapping efforts focused on specific chromosomal regions. We also describe the integration of sequence data with the map.

13.
The recent breakthroughs in next-generation sequencing technologies, such as those of Roche 454, Illumina/Solexa, and ABI SOLiD, have dramatically reduced the cost of producing short reads of the genome of new species. The huge volume of reads, along with short read length, high coverage, and sequencing errors, poses a great challenge to de novo genome assembly. However, the paired-end information provides a new solution to these problems. In this paper, we review and compare some current assembly tools, including Newbler, CAP3, Velvet, SOAPdenovo, AllPaths, ABySS, IDBA, PE-Assembly, and Telescoper. In general, we compare the seed extension and graph-based methods that use the overlap/layout/consensus approach and the de Bruijn graph approach for assembly. At the end of the paper, we summarize these methods and discuss the future directions of genome assembly.
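The de Bruijn graph approach named in this abstract can be illustrated with a minimal construction step. The sketch assumes error-free reads on a single strand and ignores reverse complements, paired-end information, and graph simplification, all of which real assemblers handle; the function name `de_bruijn_graph` and the example reads are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph from reads.

    Nodes are (k-1)-mers; each k-mer found in a read adds an edge from
    its (k-1)-base prefix to its (k-1)-base suffix.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    # Sort successors so the output is deterministic.
    return {node: sorted(successors) for node, successors in graph.items()}


# Hypothetical overlapping reads.
print(de_bruijn_graph(["ACGT", "CGTA"], 3))
```

Unlike the overlap/layout/consensus approach, this construction never compares reads pairwise: overlaps are implied by shared (k-1)-mers, which is why it scales to large volumes of short reads.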

14.
A physical map of the mouse genome   Total citations: 1 (self-citations: 0; citations by others: 1)
A physical map of a genome is an essential guide for navigation, allowing the location of any gene or other landmark in the chromosomal DNA. We have constructed a physical map of the mouse genome that contains 296 contigs of overlapping bacterial clones and 16,992 unique markers. The mouse contigs were aligned to the human genome sequence on the basis of 51,486 homology matches, thus enabling use of the conserved synteny (correspondence between chromosome blocks) of the two genomes to accelerate construction of the mouse map. The map provides a framework for assembly of whole-genome shotgun sequence data, and a tile path of clones for generation of the reference sequence. Definition of the human-mouse alignment at this level of resolution enables identification of a mouse clone that corresponds to almost any position in the human genome. The human sequence may be used to facilitate construction of other mammalian genome maps using the same strategy.

15.
Hyman RW  Fung E  Conway A  Kurdi O  Mao J  Miranda M  Nakao B  Rowley D  Tamaki T  Wang F  Davis RW 《Nature》2002,419(6906):534-537
The human malaria parasite Plasmodium falciparum is responsible for the death of more than a million people every year. To stimulate basic research on the disease, and to promote the development of effective drugs and vaccines against the parasite, the complete genome of P. falciparum clone 3D7 has been sequenced, using a chromosome-by-chromosome shotgun strategy. Here we report the nucleotide sequence of the third largest of the parasite's 14 chromosomes, chromosome 12, which comprises about 10% of the 23-megabase genome. As the most (A + T)-rich (80.6%) genome sequenced to date, the P. falciparum genome presented severe problems during the assembly of primary sequence reads. We discuss the methodology that yielded a finished and fully contiguous sequence for chromosome 12. The biological implications of the sequence data are more thoroughly discussed in an accompanying Article (ref. 3).

16.
The sequence of the rice genome holds fundamental information for its biology, including physiology, genetics, development, and evolution, as well as information on many beneficial phenotypes of economic significance. Using a "whole genome shotgun" approach, we have produced a draft rice genome sequence of Oryza sativa ssp. indica, the major crop rice subspecies in China and many other regions of Asia. The draft genome sequence is constructed from over 4.3 million successful sequencing traces with an accumulative total length of 2214.9 Mb. The initial assembly of the non-redundant sequences reached 409.76 Mb in length, based on 3.30 million successful sequencing traces with a total length of 1797.4 Mb from an indica variant, cultivar 93-11, giving an estimated coverage of 95.29% of the rice genome with an average base accuracy of higher than 99%. The coverage of the draft sequence, the randomness of the sequence distribution, and the consistency of BIG-ASSEMBLER, a custom-designed software package used for the initial assembly, were verified rigorously by comparisons against finished BAC clone sequences from both indica and japonica strains, available from the public databases. Overall, 96.3% of full-length cDNAs, 96.4% of STS, STR, and RFLP markers, 94.0% of ESTs, and 94.9% of unigene clusters were identified from the draft sequence. Our preliminary analysis of the data set shows that our rice draft sequence is consistent with the common standard accepted by the genome sequencing community. The unconditional release of the draft to the public also provides a fundamental resource to the international scientific communities to facilitate genomic and genetic studies on rice biology.

17.
Smith DJ  Whitehouse I 《Nature》2012,483(7390):434-438
Fifty per cent of the genome is discontinuously replicated on the lagging strand as Okazaki fragments. Eukaryotic Okazaki fragments remain poorly characterized and, because nucleosomes are rapidly deposited on nascent DNA, Okazaki fragment processing and nucleosome assembly potentially affect one another. Here we show that ligation-competent Okazaki fragments in Saccharomyces cerevisiae are sized according to the nucleosome repeat. Using deep sequencing, we demonstrate that ligation junctions preferentially occur near nucleosome midpoints rather than in internucleosomal linker regions. Disrupting chromatin assembly or lagging-strand polymerase processivity affects both the size and the distribution of Okazaki fragments, suggesting a role for nascent chromatin, assembled immediately after the passage of the replication fork, in the termination of Okazaki fragment synthesis. Our studies represent, to our knowledge, the first high-resolution analysis of eukaryotic Okazaki fragments in vivo, and reveal the interconnection between lagging-strand synthesis and chromatin assembly.

18.
Fyodorov DV  Kadonaga JT 《Nature》2002,418(6900):897-900
The assembly of DNA into chromatin is a critical step in the replication and repair of the eukaryotic genome. It has been known for nearly 20 years that chromatin assembly is an ATP-dependent process. ATP-dependent chromatin-assembly factor (ACF) uses the energy of ATP hydrolysis for the deposition of histones into periodic nucleosome arrays, and the ISWI subunit of ACF is an ATPase that is related to helicases. Here we show that ACF becomes committed to the DNA template upon initiation of chromatin assembly. We also observed that ACF assembles nucleosomes in localized arrays, rather than randomly distributing them. By using a purified ACF-dependent system for chromatin assembly, we found that ACF hydrolyses about 2 to 4 molecules of ATP per base pair in the assembly of nucleosomes. This level of ATP hydrolysis is similar to that used by DNA helicases for the unwinding of DNA. These results suggest that a tracking mechanism exists in which ACF assembles chromatin as an ATP-driven DNA-translocating motor. Moreover, this proposed mechanism for ACF may be relevant to the function of other chromatin-remodelling factors that contain ISWI subunits.
