Similar articles
 Found 20 similar articles (search time: 46 ms)
1.
Wang J  Wang W  Li R  Li Y  Tian G  Goodman L  Fan W  Zhang J  Li J  Zhang J  Guo Y  Feng B  Li H  Lu Y  Fang X  Liang H  Du Z  Li D  Zhao Y  Hu Y  Yang Z  Zheng H  Hellmann I  Inouye M  Pool J  Yi X  Zhao J  Duan J  Zhou Y  Qin J  Ma L  Li G  Yang Z  Zhang G  Yang B  Yu C  Liang F  Li W  Li S  Li D  Ni P  Ruan J  Li Q  Zhu H  Liu D  Lu Z  Li N  Guo G  Zhang J  Ye J  Fang L  Hao Q  Chen Q  Liang Y  Su Y  San A  Ping C  Yang S  Chen F  Li L  Zhou K  Zheng H  Ren Y  Yang L  Gao Y  Yang G  Li Z  Feng X  Kristiansen K  Wong GK  Nielsen R  Durbin R  Bolund L  Zhang X 《Nature》2008,456(7218):60-65
Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics.

2.
A map of human genome variation from population-scale sequencing   Total citations: 2 (self-citations: 0, citations by others: 2)
1000 Genomes Project Consortium 《Nature》2010,467(7319):1061-1073
The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10^-8 per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.
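To put the per-base mutation rate quoted in this abstract into perspective, the following minimal sketch converts it into an expected number of new base substitutions per child. The haploid genome size of 3.1 billion base pairs is an assumption introduced here (it does not appear in the abstract), and the function name `expected_de_novo` is purely illustrative.

```python
def expected_de_novo(rate_per_bp: float, haploid_bp: float, copies: int = 2) -> float:
    """Expected de novo base substitutions per generation.

    rate_per_bp: mutation rate per base pair per generation.
    haploid_bp:  size of one haploid genome copy in base pairs.
    copies:      genome copies inherited (2: one from each parent).
    """
    return rate_per_bp * haploid_bp * copies


RATE = 1e-8          # "approximately 10^-8", taken from the abstract
HAPLOID_BP = 3.1e9   # ASSUMPTION: approximate haploid human genome size

print(round(expected_de_novo(RATE, HAPLOID_BP)))
```

Under these assumptions the estimate comes to roughly 60 new substitutions per child; it scales linearly with whichever genome size is plugged in.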

3.
Genetic variation among individual humans occurs on many different scales, ranging from gross alterations in the human karyotype to single nucleotide changes. Here we explore variation on an intermediate scale--particularly insertions, deletions and inversions affecting from a few thousand to a few million base pairs. We employed a clone-based method to interrogate this intermediate structural variation in eight individuals of diverse geographic ancestry. Our analysis provides a comprehensive overview of the normal pattern of structural variation present in these genomes, refining the location of 1,695 structural variants. We find that 50% were seen in more than one individual and that nearly half lay outside regions of the genome previously described as structurally variant. We discover 525 new insertion sequences that are not present in the human reference genome and show that many of these are variable in copy number between individuals. Complete sequencing of 261 structural variants reveals considerable locus complexity and provides insights into the different mutational processes that have shaped the human genome. These data provide the first high-resolution sequence map of human structural variation--a standard for genotyping platforms and a prelude to future individual genome sequencing projects.

4.
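Whole-genome sequencing technologies and their application in woody plants   Total citations: 2 (self-citations: 0, citations by others: 2)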
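Genome sequences are an essential information base for genetic research. With sequencing technology advancing rapidly to third-generation long-read methods, read lengths have grown from tens of bases to tens of thousands of bases, which greatly benefits the completeness and accuracy of genome assemblies. Whole-genome sequencing has now been completed for a large number of plant species, including more than 40 woody plants, and whole-genome sequencing of many more tree species is under way. Researchers have also developed numerous bioinformatics tools for genome assembly and downstream analysis tailored to the various sequencing technologies. Covering sequencing technologies, genome assembly techniques and bioinformatic analysis of whole-genome sequence data, the authors list the woody plants whose whole genomes have been sequenced to date and describe the development and application of whole-genome sequencing technologies, as well as the analysis software suited to assembling genomes from third-generation data, as a reference for researchers in forest tree genomics.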

5.
DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation. Here we report an approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost. Single molecules of DNA are attached to a flat surface, amplified in situ and used as templates for synthetic sequencing with fluorescent reversible terminator deoxyribonucleotides. Images of the surface are analysed to generate high-quality sequence. We demonstrate application of this approach to human genome sequencing on flow-sorted X chromosomes and then scale the approach to determine the genome sequence of a male Yoruba from Ibadan, Nigeria. We build an accurate consensus sequence from >30x average depth of paired 35-base reads. We characterize four million single-nucleotide polymorphisms and four hundred thousand structural variants, many of which were previously unknown. Our approach is effective for accurate, rapid and economical whole-genome re-sequencing and many other biomedical applications.

6.
Sequence identification of 2,375 human brain genes.   Total citations: 81 (self-citations: 0, citations by others: 81)
We recently described a new approach for the rapid characterization of expressed genes by partial DNA sequencing to generate 'expressed sequence tags'. From a set of 600 human brain complementary DNA clones, 348 were informative nuclear-encoded messenger RNAs. We have now partially sequenced 2,672 new, independent cDNA clones isolated from four human brain cDNA libraries to generate 2,375 expressed sequence tags to nuclear-encoded genes. These sequences, together with 348 brain expressed sequence tags from our previous study, comprise more than 2,500 new human genes and 870,769 base pairs of DNA sequence. These data represent an approximate doubling of the number of human genes identified by DNA sequencing and may represent as many as 5% of the genes in the human genome.

7.
Genome sequencing in microfabricated high-density picolitre reactors   Total citations: 21 (self-citations: 0, citations by others: 21)
The proliferation of large-scale DNA-sequencing projects in recent years has driven a search for alternative methods to reduce time and cost. Here we describe a scalable, highly parallel sequencing system with raw throughput significantly greater than that of state-of-the-art capillary electrophoresis instruments. The apparatus uses a novel fibre-optic slide of individual wells and is able to sequence 25 million bases, at 99% or better accuracy, in one four-hour run. To achieve an approximately 100-fold increase in throughput over current Sanger sequencing technology, we have developed an emulsion method for DNA amplification and an instrument for sequencing by synthesis using a pyrosequencing protocol optimized for solid support and picolitre-scale volumes. Here we show the utility, throughput, accuracy and robustness of this system by shotgun sequencing and de novo assembly of the Mycoplasma genitalium genome with 96% coverage at 99.96% accuracy in one run of the machine.

8.
Recent advances in whole-genome sequencing have brought the vision of personal genomics and genomic medicine closer to reality. However, current methods lack clinical accuracy and the ability to describe the context (haplotypes) in which genome variants co-occur in a cost-effective manner. Here we describe a low-cost DNA sequencing and haplotyping process, long fragment read (LFR) technology, which is similar to sequencing long single DNA molecules without cloning or separation of metaphase chromosomes. In this study, ten LFR libraries were made using only ~100 picograms of human DNA per sample. Up to 97% of the heterozygous single nucleotide variants were assembled into long haplotype contigs. Removal of false positive single nucleotide variants not phased by multiple LFR haplotypes resulted in a final genome error rate of 1 in 10 megabases. Cost-effective and accurate genome sequencing and haplotyping from 10-20 human cells, as demonstrated here, will enable comprehensive genetic studies and diverse clinical applications.

9.
Strategies for assembling large, complex genomes have evolved to include a combination of whole-genome shotgun sequencing and hierarchical map-assisted sequencing. Whole-genome maps of all types can aid genome assemblies, generally starting with low-resolution cytogenetic maps and ending with the highest resolution of sequence. Fingerprint clone maps are based upon complete restriction enzyme digests of clones representative of the target genome, and ultimately comprise a near-contiguous path of clones across the genome. Such clone-based maps are used to validate sequence assembly order, supply long-range linking information for assembled sequences, anchor sequences to the genetic map and provide templates for closing gaps. Fingerprint maps are also a critical resource for subsequent functional genomic studies, because they provide a redundant and ordered sampling of the genome with clones. In an accompanying paper we describe the draft genome sequence of the chicken, Gallus gallus, the first species sequenced that is both a model organism and a global food source. Here we present a clone-based physical map of the chicken genome at 20-fold coverage, containing 260 contigs of overlapping clones. This map represents approximately 91% of the chicken genome and enables identification of chicken clones aligned to positions in other sequenced genomes.

10.
R K Saiki  T L Bugawan  G T Horn  K B Mullis  H A Erlich 《Nature》1986,324(6093):163-166
Allelic sequence variation has been analysed by synthetic oligonucleotide hybridization probes which can detect single base substitutions in human genomic DNA. An allele-specific oligonucleotide (ASO) will only anneal to sequences that match it perfectly, a single mismatch being sufficient to prevent hybridization under appropriate conditions. To improve the sensitivity, specificity and simplicity of this approach, we used the polymerase chain reaction (PCR) procedure to enzymatically amplify a specific segment of the beta-globin or HLA-DQ alpha gene in human genomic DNA before hybridization with ASOs. This in vitro amplification method, which produces a greater than 10^5-fold increase in the amount of target sequence, permits the analysis of allelic variation with as little as 1 ng of genomic DNA and the use of a simple 'dot blot' for probe hybridization. As a further simplification, PCR amplification has been performed directly on crude cell lysates, eliminating the need for DNA purification.
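As a rough illustration of the amplification figure quoted in this abstract, the sketch below estimates how many PCR cycles a given fold increase would require. It rests on an idealised assumption that is not stated in the abstract: every cycle multiplies the target by (1 + efficiency), with an efficiency of 1.0 meaning perfect doubling. The function name `min_cycles` and its parameters are illustrative only.

```python
import math


def min_cycles(fold_increase: float, efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles that reaches `fold_increase`.

    ASSUMPTION: each cycle multiplies the amount of target by
    (1 + efficiency); efficiency = 1.0 is perfect doubling.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    if fold_increase <= 1:
        return 0
    return math.ceil(math.log(fold_increase) / math.log(1 + efficiency))


print(min_cycles(1e5))       # perfect doubling
print(min_cycles(1e5, 0.9))  # slightly imperfect cycles
```

With perfect doubling, 17 cycles are enough to pass 10^5-fold (2^17 = 131,072), and at 90% per-cycle efficiency the same target takes 18 cycles. Real reactions plateau, so these are lower bounds rather than protocol recommendations.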

11.
Cancers arise owing to mutations in a subset of genes that confer growth advantage. The availability of the human genome sequence led us to propose that systematic resequencing of cancer genomes for mutations would lead to the discovery of many additional cancer genes. Here we report more than 1,000 somatic mutations found in 274 megabases (Mb) of DNA corresponding to the coding exons of 518 protein kinase genes in 210 diverse human cancers. There was substantial variation in the number and pattern of mutations in individual cancers reflecting different exposures, DNA repair defects and cellular origins. Most somatic mutations are likely to be 'passengers' that do not contribute to oncogenesis. However, there was evidence for 'driver' mutations contributing to the development of the cancers studied in approximately 120 genes. Systematic sequencing of cancer genomes therefore reveals the evolutionary diversity of cancers and implicates a larger repertoire of cancer genes than previously anticipated.
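The headline numbers in this abstract can be turned into an average mutation density with a few lines of arithmetic. The sketch below is only a back-of-the-envelope reading: it assumes the 274 Mb figure is the total sequence screened across all 210 cancers, and it treats "more than 1,000" as a lower bound of 1,000. The constant and function names are illustrative.

```python
MUTATIONS = 1_000    # "more than 1,000" somatic mutations, used as a lower bound
SEQUENCED_MB = 274   # ASSUMPTION: total megabases screened across all cancers
CANCERS = 210        # number of cancers screened


def per_mb(count: float, megabases: float) -> float:
    """Average number of events per megabase of screened sequence."""
    return count / megabases


density = per_mb(MUTATIONS, SEQUENCED_MB)
mb_per_cancer = SEQUENCED_MB / CANCERS

print(f"at least {density:.2f} somatic mutations per Mb")
print(f"about {mb_per_cancer:.2f} Mb screened per cancer")
```

Under these assumptions the average comes to at least 3.65 mutations per Mb over about 1.30 Mb of kinase coding sequence per cancer. The abstract stresses that individual cancers vary substantially around any such average.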

12.
Melanoma is notable for its metastatic propensity, lethality in the advanced setting and association with ultraviolet exposure early in life. To obtain a comprehensive genomic view of melanoma in humans, we sequenced the genomes of 25 metastatic melanomas and matched germline DNA. A wide range of point mutation rates was observed: lowest in melanomas whose primaries arose on non-ultraviolet-exposed hairless skin of the extremities (3 and 14 per megabase (Mb) of genome), intermediate in those originating from hair-bearing skin of the trunk (5-55 per Mb), and highest in a patient with a documented history of chronic sun exposure (111 per Mb). Analysis of whole-genome sequence data identified PREX2 (phosphatidylinositol-3,4,5-trisphosphate-dependent Rac exchange factor 2)--a PTEN-interacting protein and negative regulator of PTEN in breast cancer--as a significantly mutated gene with a mutation frequency of approximately 14% in an independent extension cohort of 107 human melanomas. PREX2 mutations are biologically relevant, as ectopic expression of mutant PREX2 accelerated tumour formation of immortalized human melanocytes in vivo. Thus, whole-genome sequencing of human melanoma tumours revealed genomic evidence of ultraviolet pathogenesis and discovered a new recurrently mutated gene in melanoma.

13.
Tumour evolution inferred by single-cell sequencing   Total citations: 1 (self-citations: 0, citations by others: 1)
Genomic analysis provides insights into the role of copy number variation in disease, but most methods are not designed to resolve mixed populations of cells. In tumours, where genetic heterogeneity is common, very important information may be lost that would be useful for reconstructing evolutionary history. Here we show that with flow-sorted nuclei, whole genome amplification and next generation sequencing we can accurately quantify genomic copy number within an individual nucleus. We apply single-nucleus sequencing to investigate tumour population structure and evolution in two human breast cancer cases. Analysis of 100 single cells from a polygenomic tumour revealed three distinct clonal subpopulations that probably represent sequential clonal expansions. Additional analysis of 100 single cells from a monogenomic primary tumour and its liver metastasis indicated that a single clonal expansion formed the primary tumour and seeded the metastasis. In both primary tumours, we also identified an unexpectedly abundant subpopulation of genetically diverse 'pseudodiploid' cells that do not travel to the metastatic site. In contrast to gradual models of tumour progression, our data indicate that tumours grow by punctuated clonal expansions with few persistent intermediates.

14.
15.
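Molecular breeding refers to selection and breeding using DNA markers associated with traits, also called marker-assisted selection or marker-assisted breeding; in a broad sense it also includes genetic engineering breeding and genomics-assisted breeding. Molecular breeding of forest trees offers a highly promising and efficient means of early selection and accelerated breeding. The authors review progress in, and prospects for, the genomic information resources that underpin molecular breeding research in forest trees. Over the past 30 years, molecular marker technology for forest trees has developed from early low-throughput methods to current high-throughput technologies based on microarray chips and next-generation sequencing, such as genotyping-by-sequencing, transcriptome sequencing, resequencing, amplicon sequencing and exome sequencing, and has been widely used in studies that detect trait-associated DNA variation, including linkage mapping, association analysis and genomic selection. Since the publication of the Populus trichocarpa genome sequence in 2006, the genomes of more than 50 tree species have been sequenced. Linkage mapping and association studies have detected a large number of genomic loci associated with growth, wood properties, stress resistance and non-wood product quality traits in more than 10 genera of forest trees. The main trends are: (1) a broad range of phenotypes, covering economic traits, physiological indices and metabolic components; (2) markers numbering in the thousands to millions, covering the whole genome; (3) molecular variation from multiple omics, such as the transcriptome and degradome, beginning to be applied; (4) use of large populations to improve the precision of locus detection; (5) attention to environmental effects, with field trials established at multiple sites to dissect the interactions of QTL with environment and year; and (6) further mining of trait-associated candidate genes by combining reference genome sequences and/or differentially expressed genes from transcriptomes. Genomic selection models have been established for major plantation tree species in genera such as Eucalyptus, Pinus and Picea. In addition, other information resources have accumulated, including pan-genomes, related software and algorithms, functional genes, genome editing technologies, and websites and databases. The main challenges facing molecular breeding of forest trees are: (1) how to obtain stable trait-associated genomic loci and genomic selection (GS) models; (2) the lack of automated, non-destructive and high-throughput phenotyping technologies; (3) the continuing difficulty of obtaining high-quality genome sequences for conifers with large genomes and for some polyploid tree species; (4) the additional cost of marker-assisted selection beyond that of conventional breeding, together with its uncertainty; and (5) the fact that accelerated breeding remains difficult for most tree species. In the post-genomic era, molecular breeding of forest trees will be effectively integrated into conventional breeding programmes and will markedly increase genetic gain.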

16.
17.
Guide to the draft human genome   Total citations: 5 (self-citations: 0, citations by others: 5)
Wolfsberg TG  McEntyre J  Schuler GD 《Nature》2001,409(6822):824-826
There are a number of ways to investigate the structure, function and evolution of the human genome. These include examining the morphology of normal and abnormal chromosomes, constructing maps of genomic landmarks, following the genetic transmission of phenotypes and DNA sequence variations, and characterizing thousands of individual genes. To this list we can now add the elucidation of the genomic DNA sequence, albeit at 'working draft' accuracy. The current challenge is to weave together these disparate types of data to produce the information infrastructure needed to support the next generation of biomedical research. Here we provide an overview of the different sources of information about the human genome and how modern information technology, in particular the internet, allows us to link them together.

18.
We present a global comparison of differences in content of segmental duplication between human and chimpanzee, and determine that 33% of human duplications (>94% sequence identity) are not duplicated in chimpanzee, including some human disease-causing duplications. Combining experimental and computational approaches, we estimate a genomic duplication rate of 4-5 megabases per million years since divergence. These changes have resulted in gene expression differences between the species. In terms of numbers of base pairs affected, we determine that de novo duplication has contributed most significantly to differences between the species, followed by deletion of ancestral duplications. Post-speciation gene conversion accounts for less than 10% of recent segmental duplication. Chimpanzee-specific hyperexpansion (>100 copies) of particular segments of DNA has resulted in marked quantitative differences and alterations in the genome landscape between chimpanzee and human. Almost all of the most extreme differences relate to changes in chromosome structure, including the emergence of African great ape subterminal heterochromatin. Nevertheless, base per base, large segmental duplication events have had a greater impact (2.7%) in altering the genomic landscape of these two species than single-base-pair substitution (1.2%).

19.
The systematic comparison of genomic sequences from different organisms represents a central focus of contemporary genome analysis. Comparative analyses of vertebrate sequences can identify coding and conserved non-coding regions, including regulatory elements, and provide insight into the forces that have rendered modern-day genomes. As a complement to whole-genome sequencing efforts, we are sequencing and comparing targeted genomic regions in multiple, evolutionarily diverse vertebrates. Here we report the generation and analysis of over 12 megabases (Mb) of sequence from 12 species, all derived from the genomic region orthologous to a segment of about 1.8 Mb on human chromosome 7 containing ten genes, including the gene mutated in cystic fibrosis. These sequences show conservation reflecting both functional constraints and the neutral mutational events that shaped this genomic region. In particular, we identify substantial numbers of conserved non-coding segments beyond those previously identified experimentally, most of which are not detectable by pair-wise sequence comparisons alone. Analysis of transposable element insertions highlights the variation in genome dynamics among these species and confirms the placement of rodents as a sister group to the primates.

20.
Most genomic variation is attributable to single nucleotide polymorphisms (SNPs), which therefore offer the highest resolution for tracking disease genes and population history. It has been proposed that a dense map of 30,000-500,000 SNPs can be used to scan the human genome for haplotypes associated with common diseases. Here we describe a simple but powerful method, called reduced representation shotgun (RRS) sequencing, for creating SNP maps. RRS re-samples specific subsets of the genome from several individuals, and compares the resulting sequences using a highly accurate SNP detection algorithm. The method can be extended by alignment to available genome sequence, increasing the yield of SNPs and providing map positions. These methods are being used by The SNP Consortium, an international collaboration of academic centres, pharmaceutical companies and a private foundation, to discover and release at least 300,000 human SNPs. We have discovered 47,172 human SNPs by RRS, and in total the Consortium has identified 148,459 SNPs. More broadly, RRS facilitates the rapid, inexpensive construction of SNP maps in biomedically and agriculturally important species. SNPs discovered by RRS also offer unique advantages for large-scale genotyping.
