Similar articles
 20 similar articles found (search time: 406 ms)
1.
Inversions, deletions and insertions are important mediators of disease and disease susceptibility. We systematically compared the human genome reference sequence with a second genome (represented by fosmid paired-end sequences) to detect intermediate-sized structural variants >8 kb in length. We identified 297 sites of structural variation: 139 insertions, 102 deletions and 56 inversion breakpoints. Using combined literature, sequence and experimental analyses, we validated 112 of the structural variants, including several that are of biomedical relevance. These data provide a fine-scale structural variation map of the human genome and the requisite sequence precision for subsequent genetic studies of human disease.

2.
Huang X  Zhao Y  Wei X  Li C  Wang A  Zhao Q  Li W  Guo Y  Deng L  Zhu C  Fan D  Lu Y  Weng Q  Liu K  Zhou T  Jing Y  Si L  Dong G  Huang T  Lu T  Feng Q  Qian Q  Li J  Han B 《Nature genetics》2012,44(1):32-39
A high-density haplotype map recently enabled a genome-wide association study (GWAS) in a population of indica subspecies of Chinese rice landraces. Here we extend this methodology to a larger and more diverse sample of 950 worldwide rice varieties, including the Oryza sativa indica and Oryza sativa japonica subspecies, to perform an additional GWAS. We identified a total of 32 new loci associated with flowering time and with ten grain-related traits, indicating that the larger sample increased the power to detect trait-associated variants using GWAS. To characterize various alleles and complex genetic variation, we developed an analytical framework for haplotype-based de novo assembly of the low-coverage sequencing data in rice. We identified candidate genes for 18 associated loci through detailed annotation. This study shows that the integrated approach of sequence-based GWAS and functional genome annotation has the potential to match complex traits to their causal polymorphisms in rice.

3.
Numerous types of DNA variation exist, ranging from SNPs to larger structural alterations such as copy number variants (CNVs) and inversions. Alignment of DNA sequence from different sources has been used to identify SNPs and intermediate-sized variants (ISVs). However, only a small proportion of total heterogeneity is characterized, and little is known of the characteristics of most smaller-sized (<50 kb) variants. Here we show that genome assembly comparison is a robust approach for identification of all classes of genetic variation. Through comparison of two human assemblies (Celera's R27c compilation and the Build 35 reference sequence), we identified megabases of sequence (in the form of 13,534 putative non-SNP events) that were absent, inverted or polymorphic in one assembly. Database comparison and laboratory experimentation further demonstrated overlap or validation for 240 variable regions and confirmed >1.5 million SNPs. Some differences were simple insertions and deletions, but in regions containing CNVs, segmental duplication and repetitive DNA, they were more complex. Our results uncover substantial undescribed variation in humans, highlighting the need for comprehensive annotation strategies to fully interpret genome scanning and personalized sequencing projects.

4.
We report the analysis of a Japanese male using high-throughput sequencing to 40× coverage. More than 99% of the sequence reads were mapped to the reference human genome. Using a Bayesian decision method, we identified 3,132,608 single nucleotide variations (SNVs). Comparison with six previously reported genomes revealed an excess of singleton nonsense and nonsynonymous SNVs, as well as singleton SNVs in conserved non-coding regions. We also identified 5,319 deletions smaller than 10 kb with high accuracy, in addition to copy number variations and rearrangements. De novo assembly of the unmapped sequence reads generated around 3 Mb of novel sequence, which showed high similarity to non-reference human genomes and the human herpesvirus 4 genome. Our analysis suggests that considerable variation remains undiscovered in the human genome and that whole-genome sequencing is an invaluable tool for obtaining a complete understanding of human genetic variation.

5.
Variation in the human genome sequence is key to understanding susceptibility to disease in modern populations and the history of ancestral populations. Unlocking this information requires knowledge of the patterns and underlying causes of human sequence diversity. By applying a new population-genetic framework to two genome-wide polymorphism surveys, we find that the human genome contains sizeable regions (stretching over tens of thousands of base pairs) that have intrinsically high and low rates of sequence variation. We show that the primary determinant of these patterns is shared genealogical history. Only a fraction of the variation (at most 25%) is due to the local mutation rate. By measuring the average distance over which genealogical histories are typically preserved, these data provide the first genome-wide estimate of the average extent of correlation among variants (linkage disequilibrium). The results are best explained by extreme variability in the recombination rate at a fine scale, and provide the first empirical evidence that such recombination 'hot spots' are a general feature of the human genome and have a principal role in shaping genetic variation in the human population.

6.
Many sequence variants affecting diversity of adult human height   Total citations: 1, self-citations: 0, citations by others: 1
Adult human height is one of the classical complex human traits. We searched for sequence variants that affect height by scanning the genomes of 25,174 Icelanders, 2,876 Dutch, 1,770 European Americans and 1,148 African Americans. We then combined these results with previously published results from the Diabetes Genetics Initiative on 3,024 Scandinavians and tested a selected subset of SNPs in 5,517 Danes. We identified 27 regions of the genome with one or more sequence variants showing significant association with height. The estimated effects per allele of these variants ranged between 0.3 and 0.6 cm and, taken together, they explain around 3.7% of the population variation in height. The genes neighboring the identified loci cluster in biological processes related to skeletal development and mitosis. Associations with three previously reported loci are replicated in our analyses, and the strongest association was with SNPs in the ZBTB38 gene.

7.
Radiation hybrid map of the mouse genome.   Total citations: 13, self-citations: 0, citations by others: 13
Radiation hybrid (RH) maps are a useful tool for genome analysis, providing a direct method for localizing genes and anchoring physical maps and genomic sequence along chromosomes. The construction of a comprehensive RH map for the human genome has resulted in gene maps reflecting the location of more than 30,000 human genes. Here we report the first comprehensive RH map of the mouse genome. The map contains 2,486 loci screened against an RH panel of 93 cell lines. Most loci (93%) are simple sequence length polymorphisms (SSLPs) taken from the mouse genetic map, thereby providing direct integration between these two key maps. We performed RH mapping by a new and efficient approach in which we replaced traditional gel- or hybridization-based assays with a homogeneous 5'-nuclease assay involving a single common probe for all genetic markers. The map provides essentially complete connectivity and coverage across the genome, and good resolution for ordering loci, with 1 centiRay (cR) corresponding to an average of approximately 100 kb. The RH map, together with an accompanying World-Wide Web server, makes it possible for any investigator to rapidly localize sequences in the mouse genome. Together with the previously constructed genetic map and a YAC-based physical map reported in a companion paper, the fundamental maps required for mouse genomics are now available.

8.
The genome of the extremophile crucifer Thellungiella parvula   Total citations: 1, self-citations: 0, citations by others: 1
Thellungiella parvula is related to Arabidopsis thaliana and is endemic to saline, resource-poor habitats, making it a model for the evolution of plant adaptation to extreme environments. Here we present the draft genome for this extremophile species. Exclusively by next generation sequencing, we obtained the de novo assembled genome in 1,496 gap-free contigs, closely approximating the estimated genome size of 140 Mb. We anchored these contigs to seven pseudo chromosomes without the use of maps. We show that short reads can be assembled to a near-complete chromosome level for a eukaryotic species lacking prior genetic information. The sequence identifies a number of tandem duplications that, by the nature of the duplicated genes, suggest a possible basis for T. parvula's extremophile lifestyle. Our results provide essential background for developing genomically influenced testable hypotheses for the evolution of environmental stress tolerance.

9.
The per-generation mutation rate in humans is high. De novo mutations may compensate for allele loss due to severely reduced fecundity in common neurodevelopmental and psychiatric diseases, explaining a major paradox in evolutionary genetic theory. Here we used a family based exome sequencing approach to test this de novo mutation hypothesis in ten individuals with unexplained mental retardation. We identified and validated unique non-synonymous de novo mutations in nine genes. Six of these, identified in six different individuals, are likely to be pathogenic based on gene function, evolutionary conservation and mutation impact. Our findings provide strong experimental support for a de novo paradigm for mental retardation. Together with de novo copy number variation, de novo point mutations of large effect could explain the majority of all mental retardation cases in the population.

10.
Most human sequence variation is in the form of single-nucleotide polymorphisms (SNPs). It has been proposed that coding-region SNPs (cSNPs) be used for direct association studies to determine the genetic basis of complex traits. The success of such studies depends on the frequency of disease-associated alleles, and their distribution in different ethnic populations. If disease-associated alleles are frequent in most populations, then direct genotyping of candidate variants could show robust associations in manageable study samples. This approach is less feasible if the genetic risk from a given candidate gene is due to many infrequent alleles. Previous studies of several genes demonstrated that most variants are relatively infrequent (<0.05). These surveys genotyped small samples (n<75) and thus had limited ability to identify rare alleles. Here we evaluate the prevalence and distribution of such rare alleles by genotyping an ethnically diverse reference sample that is more than six times larger than those used in previous studies (n=450). We screened for variants in the complete coding sequence and intron-exon junctions of two candidate genes for neuropsychiatric phenotypes: SLC6A4, encoding the serotonin transporter; and SLC18A2, encoding the vesicular monoamine transporter. Both genes have unique roles in neuronal transmission, and variants in either gene might be associated with neurobehavioral phenotypes.

11.
Despite its high heritability, a large fraction of individuals with schizophrenia do not have a family history of the disease (sporadic cases). Here we examined the possibility that rare de novo protein-altering mutations contribute to the genetic component of schizophrenia by sequencing the exomes of 53 sporadic cases, 22 unaffected controls and their parents. We identified 40 de novo mutations in 27 cases affecting 40 genes, including a potentially disruptive mutation in DGCR2, a gene located in the schizophrenia-predisposing 22q11.2 microdeletion region. A comparison to rare inherited variants indicated that the identified de novo mutations show a large excess of non-synonymous changes in schizophrenia cases, as well as a greater potential to affect protein structure and function. Our analyses suggest a major role for de novo mutations in schizophrenia as well as a large mutational target, which together provide a plausible explanation for the high global incidence and persistence of the disease.

12.
Many genes associated with CpG islands undergo de novo methylation in cancer. Studies have suggested that the pattern of this modification may be partially determined by an instructive mechanism that recognizes specifically marked regions of the genome. Using chromatin immunoprecipitation analysis, here we show that genes methylated in cancer cells are specifically packaged with nucleosomes containing histone H3 trimethylated on Lys27. This chromatin mark is established on these unmethylated CpG island genes early in development and then maintained in differentiated cell types by the presence of an EZH2-containing Polycomb complex. In cancer cells, as opposed to normal cells, the presence of this complex brings about the recruitment of DNA methyl transferases, leading to de novo methylation. These results suggest that tumor-specific targeting of de novo methylation is pre-programmed by an established epigenetic system that normally has a role in marking embryonic genes for repression.

13.
14.
《Nature genetics》2006,38(9):959
Common genomic structural variants predispose to deleterious de novo genomic rearrangements. Understanding how they do so will require population studies across the continuum of genomic variation and ethical discussion of the nature and uses of human variation.

15.
16.
DNA methylation represses transcription in vivo.   Total citations: 9, self-citations: 0, citations by others: 9

17.
Natural selection on human microRNA binding sites inferred from SNP data   Total citations: 1, self-citations: 0, citations by others: 1
Chen K  Rajewsky N 《Nature genetics》2006,38(12):1452-1456

18.
19.
Variation in DNA sequence contributes to individual differences in quantitative traits, but in humans the specific sequence variants are known for very few traits. We characterized variation in gene expression in cells from individuals belonging to three major population groups. This quantitative phenotype differs significantly between European-derived and Asian-derived populations for 1,097 of 4,197 genes tested. For the phenotypes with the strongest evidence of cis determinants, most of the variation is due to allele frequency differences at cis-linked regulators. The results show that specific genetic variation among populations contributes appreciably to differences in gene expression phenotypes. Populations differ in prevalence of many complex genetic diseases, such as diabetes and cardiovascular disease. As some of these are probably influenced by the level of gene expression, our results suggest that allele frequency differences at regulatory polymorphisms also account for some population differences in prevalence of complex diseases.

20.