Similar literature
 Found 20 similar articles (search time: 151 ms)
1.
Detection of large-scale variation in the human genome (cited 26 times: 0 self-citations, 26 by others)
We identified 255 loci across the human genome that contain genomic imbalances among unrelated individuals. Twenty-four variants are present in >10% of the individuals that we examined. Half of these regions overlap with genes, and many coincide with segmental duplications or gaps in the human genome assembly. This previously unappreciated heterogeneity may underlie certain human phenotypic variation and susceptibility to disease and argues for a more dynamic human genome structure.

2.
3.
P. cynomolgi, a malaria-causing parasite of Asian Old World monkeys, is the sister taxon of P. vivax, the most prevalent malaria-causing species in humans outside of Africa. Because P. cynomolgi shares many phenotypic, biological and genetic characteristics with P. vivax, we generated draft genome sequences for three P. cynomolgi strains and performed genomic analysis comparing them with the P. vivax genome, as well as with the genome of a third previously sequenced simian parasite, Plasmodium knowlesi. Here, we show that genomes of the monkey malaria clade can be characterized by copy-number variants (CNVs) in multigene families involved in evasion of the human immune system and invasion of host erythrocytes. We identify genome-wide SNPs, microsatellites and CNVs in the P. cynomolgi genome, providing a map of genetic variation that can be used to map parasite traits and study parasite populations. The sequencing of the P. cynomolgi genome is a critical step in developing a model system for P. vivax research and in counteracting the neglect of P. vivax.

4.
To test the hypothesis that the human genome project will uncover many genes not previously discovered by sequencing of expressed sequence tags (ESTs), we designed and produced a set of microarrays using probes based on open reading frames (ORFs) in 350 Mb of finished and draft human sequence. Our approach aims to identify all genes directly from genomic sequence by querying gene expression. We analysed genomic sequence with a suite of ORF prediction programs, selected approximately one ORF per gene, amplified the ORFs from genomic DNA and arrayed the amplicons onto treated glass slides. Of the first 10,000 arrayed ORFs, 31% are completely novel and 29% are similar, but not identical, to sequences in public databases. Approximately one-half of these are expressed in the tissues we queried by microarray. Subsequent verification by other techniques confirmed expression of several of the novel genes. Expressed sequence tags (ESTs) have yielded vast amounts of data, but our results indicate that many genes in the human genome will only be found by genomic sequencing.

5.
To understand the genetic heterogeneity underlying developmental delay, we compared copy number variants (CNVs) in 15,767 children with intellectual disability and various congenital defects (cases) to CNVs in 8,329 unaffected adult controls. We estimate that ~14.2% of disease in these children is caused by CNVs >400 kb. We observed a greater enrichment of CNVs in individuals with craniofacial anomalies and cardiovascular defects compared to those with epilepsy or autism. We identified 59 pathogenic CNVs, including 14 new or previously weakly supported candidates, refined the critical interval for several genomic disorders, such as the 17q21.31 microdeletion syndrome, and identified 940 candidate dosage-sensitive genes. We also developed methods to opportunistically discover small, disruptive CNVs within the large and growing diagnostic array datasets. This evolving CNV morbidity map, combined with exome and genome sequencing, will be critical for deciphering the genetic basis of developmental delay, intellectual disability and autism spectrum disorders.

6.
We report the analysis of a Japanese male using high-throughput sequencing to 40× coverage. More than 99% of the sequence reads were mapped to the reference human genome. Using a Bayesian decision method, we identified 3,132,608 single nucleotide variations (SNVs). Comparison with six previously reported genomes revealed an excess of singleton nonsense and nonsynonymous SNVs, as well as singleton SNVs in conserved non-coding regions. We also identified 5,319 deletions smaller than 10 kb with high accuracy, in addition to copy number variations and rearrangements. De novo assembly of the unmapped sequence reads generated around 3 Mb of novel sequence, which showed high similarity to non-reference human genomes and the human herpesvirus 4 genome. Our analysis suggests that considerable variation remains undiscovered in the human genome and that whole-genome sequencing is an invaluable tool for obtaining a complete understanding of human genetic variation.

7.
Inversions, deletions and insertions are important mediators of disease and disease susceptibility. We systematically compared the human genome reference sequence with a second genome (represented by fosmid paired-end sequences) to detect intermediate-sized structural variants >8 kb in length. We identified 297 sites of structural variation: 139 insertions, 102 deletions and 56 inversion breakpoints. Using combined literature, sequence and experimental analyses, we validated 112 of the structural variants, including several that are of biomedical relevance. These data provide a fine-scale structural variation map of the human genome and the requisite sequence precision for subsequent genetic studies of human disease.

8.
Detecting genetic variants that are highly divergent from a reference sequence remains a major challenge in genome sequencing. We introduce de novo assembly algorithms using colored de Bruijn graphs for detecting and genotyping simple and complex genetic variants in an individual or population. We provide an efficient software implementation, Cortex, the first de novo assembler capable of assembling multiple eukaryotic genomes simultaneously. Four applications of Cortex are presented. First, we detect and validate both simple and complex structural variations in a high-coverage human genome. Second, we identify more than 3 Mb of sequence absent from the human reference genome, in pooled low-coverage population sequence data from the 1000 Genomes Project. Third, we show how population information from ten chimpanzees enables accurate variant calls without a reference sequence. Last, we estimate classical human leukocyte antigen (HLA) genotypes at HLA-B, the most variable gene in the human genome.

9.
We constructed a tiling resolution array consisting of 32,433 overlapping BAC clones covering the entire human genome. This increases our ability to identify genetic alterations and their boundaries throughout the genome in a single comparative genomic hybridization (CGH) experiment. At this tiling resolution, we identified minute DNA alterations not previously reported. These alterations include microamplifications and deletions containing oncogenes, tumor-suppressor genes and new genes that may be associated with multiple tumor types. Our findings show the need to move beyond conventional marker-based genome comparison approaches that rely on inference of continuity between interval markers. Our submegabase resolution tiling set for array CGH (SMRT array) allows comprehensive assessment of genomic integrity and thereby the identification of new genes associated with disease.

10.
Genomic disorders are characterized by the presence of flanking segmental duplications that predispose these regions to recurrent rearrangement. Based on the duplication architecture of the genome, we investigated 130 regions that we hypothesized as candidates for previously undescribed genomic disorders. We tested 290 individuals with mental retardation by BAC array comparative genomic hybridization and identified 16 pathogenic rearrangements, including de novo microdeletions of 17q21.31 found in four individuals. Using oligonucleotide arrays, we refined the breakpoints of this microdeletion, defining a 478-kb critical region containing six genes that were deleted in all four individuals. We mapped the breakpoints of this deletion and of four other pathogenic rearrangements in 1q21.1, 15q13, 15q24 and 17q12 to flanking segmental duplications, suggesting that these are also sites of recurrent rearrangement. In common with the 17q21.31 deletion, these breakpoint regions are sites of copy number polymorphism in controls, indicating that these may be inherently unstable genomic regions.

11.
Sequence variation in the human angiotensin converting enzyme. (cited 32 times: 0 self-citations, 32 by others)
Angiotensin converting enzyme (encoded by the gene DCP1, also known as ACE) catalyses the conversion of angiotensin I to the physiologically active peptide angiotensin II, which controls fluid-electrolyte balance and systemic blood pressure. Because of its key function in the renin-angiotensin system, many association studies have been performed with DCP1. Nearly all studies have associated the presence (insertion, I) or absence (deletion, D) of a 287-bp Alu repeat element in intron 16 with the levels of circulating enzyme or cardiovascular pathophysiologies. Many epidemiological studies suggest that the DCP1*D allele confers increased susceptibility to cardiovascular disease; however, other reports have found no such association or even a beneficial effect. We present here the complete genomic sequence of DCP1 from 11 individuals, representing the longest contiguous scan (24 kb) for sequence variation in human DNA. We identified 78 varying sites in 22 chromosomes that resolved into 13 distinct haplotypes. Of the variant sites, 17 were in absolute linkage disequilibrium with the commonly typed Alu insertion/deletion polymorphism, producing two distinct and distantly related clades. We also identified a major subdivision in the Alu deletion clade that enables further analysis of the traits associated with this gene. The diversity uncovered in DCP1 is comparable to that described for other regions in the human genome. The highly correlated structure in DCP1 raises important issues for the determination of functional DNA variants within genes and genetic studies in humans based on marker association.

12.
Genetic variation allows the malaria parasite Plasmodium falciparum to overcome chemotherapeutic agents, vaccines and vector control strategies and remain a leading cause of global morbidity and mortality. Here we describe an initial survey of genetic variation across the P. falciparum genome. We performed extensive sequencing of 16 geographically diverse parasites and identified 46,937 SNPs, demonstrating rich diversity among P. falciparum parasites (π = 1.16 × 10⁻³) and strong correlation with gene function. We identified multiple regions with signatures of selective sweeps in drug-resistant parasites, including a previously unidentified 160-kb region with extremely low polymorphism in pyrimethamine-resistant parasites. We further characterized 54 worldwide isolates by genotyping SNPs across 20 genomic regions. These data begin to define population structure among African, Asian and American groups and illustrate the degree of linkage disequilibrium, which extends over relatively short distances in African parasites but over longer distances in Asian parasites. We provide an initial map of genetic diversity in P. falciparum and demonstrate its potential utility in identifying genes subject to recent natural selection and in understanding the population genetics of this parasite.

13.
14.
15.
Numerous types of DNA variation exist, ranging from SNPs to larger structural alterations such as copy number variants (CNVs) and inversions. Alignment of DNA sequence from different sources has been used to identify SNPs and intermediate-sized variants (ISVs). However, only a small proportion of total heterogeneity is characterized, and little is known of the characteristics of most smaller-sized (<50 kb) variants. Here we show that genome assembly comparison is a robust approach for identification of all classes of genetic variation. Through comparison of two human assemblies (Celera's R27c compilation and the Build 35 reference sequence), we identified megabases of sequence (in the form of 13,534 putative non-SNP events) that were absent, inverted or polymorphic in one assembly. Database comparison and laboratory experimentation further demonstrated overlap or validation for 240 variable regions and confirmed >1.5 million SNPs. Some differences were simple insertions and deletions, but in regions containing CNVs, segmental duplication and repetitive DNA, they were more complex. Our results uncover substantial undescribed variation in humans, highlighting the need for comprehensive annotation strategies to fully interpret genome scanning and personalized sequencing projects.

16.
17.
18.
Substantial efforts are focused on identifying single-nucleotide polymorphisms (SNPs) throughout the human genome, particularly in coding regions (cSNPs), for both linkage disequilibrium and association studies. Less attention, however, has been directed to the clarification of evolutionary processes that are responsible for the variability in nucleotide diversity among different regions of the genome. We report here the population sequence diversity of genomic segments within a 450-kb cluster of olfactory receptor (OR) genes on human chromosome 17. We found a dichotomy in the pattern of nucleotide diversity between OR pseudogenes and introns on the one hand and the closely interspersed intact genes on the other. We suggest that weak positive selection is responsible for the observed patterns of genetic variation. This is inferred from a lower ratio of polymorphism to divergence in genes compared with pseudogenes or introns, high non-synonymous substitution rates in OR genes, and a small but significant overall reduction in variability in the entire OR gene cluster compared with other genomic regions. The dichotomy among functionally different segments within a short genomic distance requires high recombination rates within this OR cluster. Our work demonstrates the impact of weak positive selection on human nucleotide diversity, and has implications for the evolution of the olfactory repertoire.

19.
Many sequence variants affecting diversity of adult human height (cited 1 time: 0 self-citations, 1 by others)
Adult human height is one of the classical complex human traits. We searched for sequence variants that affect height by scanning the genomes of 25,174 Icelanders, 2,876 Dutch, 1,770 European Americans and 1,148 African Americans. We then combined these results with previously published results from the Diabetes Genetics Initiative on 3,024 Scandinavians and tested a selected subset of SNPs in 5,517 Danes. We identified 27 regions of the genome with one or more sequence variants showing significant association with height. The estimated effects per allele of these variants ranged between 0.3 and 0.6 cm and, taken together, they explain around 3.7% of the population variation in height. The genes neighboring the identified loci cluster in biological processes related to skeletal development and mitosis. Associations with three previously reported loci are replicated in our analyses, and the strongest association was with SNPs in the ZBTB38 gene.

20.
Genome-wide association studies of 14 agronomic traits in rice landraces (cited 20 times: 0 self-citations, 20 by others)
Huang X, Wei X, Sang T, Zhao Q, Feng Q, Zhao Y, Li C, Zhu C, Lu T, Zhang Z, Li M, Fan D, Guo Y, Wang A, Wang L, Deng L, Li W, Lu Y, Weng Q, Liu K, Huang T, Zhou T, Jing Y, Li W, Lin Z, Buckler ES, Qian Q, Zhang QF, Li J, Han B. Nature Genetics, 2010, 42(11): 961-967
Uncovering the genetic basis of agronomic traits in crop landraces that have adapted to various agro-climatic conditions is important to world food security. Here we have identified ~3.6 million SNPs by sequencing 517 rice landraces and constructed a high-density haplotype map of the rice genome using a novel data-imputation method. We performed genome-wide association studies (GWAS) for 14 agronomic traits in the population of Oryza sativa indica subspecies. The loci identified through GWAS explained ~36% of the phenotypic variance, on average. The peak signals at six loci were tied closely to previously identified genes. This study provides a fundamental resource for rice genetics research and breeding, and demonstrates that an approach integrating second-generation genome sequencing and GWAS can be used as a powerful complementary strategy to classical biparental cross-mapping for dissecting complex traits in rice.
