Similar articles
Found 20 similar articles (search time: 281 ms)
1.
Lam HM, Xu X, Liu X, Chen W, Yang G, Wong FL, Li MW, He W, Qin N, Wang B, Li J, Jian M, Wang J, Shao G, Wang J, Sun SS, Zhang G. Nature Genetics, 2010, 42(12): 1053-1059
We report a large-scale analysis of the patterns of genome-wide genetic variation in soybeans. We re-sequenced a total of 17 wild and 14 cultivated soybean genomes to an average of approximately ×5 depth and >90% coverage using the Illumina Genome Analyzer II platform. We compared the patterns of genetic variation between wild and cultivated soybeans and identified higher allelic diversity in wild soybeans. We identified a high level of linkage disequilibrium in the soybean genome, suggesting that marker-assisted breeding of soybean will be less challenging than map-based cloning. We report linkage disequilibrium block location and distribution, and we identified a set of 205,614 tag SNPs that may be useful for QTL mapping and association studies. The data here provide a valuable resource for the analysis of wild soybeans and to facilitate future breeding and quantitative trait analysis.

2.
A general approach to single-nucleotide polymorphism discovery   Total citations: 29 (self-citations: 0, citations by others: 29)
Single-nucleotide polymorphisms (SNPs) are the most abundant form of human genetic variation and a resource for mapping complex genetic traits. The large volume of data produced by high-throughput sequencing projects is a rich and largely untapped source of SNPs (refs 2, 3, 4, 5). We present here a unified approach to the discovery of variations in genetic sequence data of arbitrary DNA sources. We propose to use the rapidly emerging genomic sequence as a template on which to layer often unmapped, fragmentary sequence data and to use base quality values to discern true allelic variations from sequencing errors. By taking advantage of the genomic sequence we are able to use simpler yet more accurate methods for sequence organization: fragment clustering, paralogue identification and multiple alignment. We analyse these sequences with a novel, Bayesian inference engine, POLYBAYES, to calculate the probability that a given site is polymorphic. Rigorous treatment of base quality permits completely automated evaluation of the full length of all sequences, without limitations on alignment depth. We demonstrate this approach by accurate SNP predictions in human ESTs aligned to finished and working-draft quality genomic sequences, a data set representative of the typical challenges of sequence-based SNP discovery.

3.
Single-nucleotide polymorphisms (SNPs) have been the focus of much attention in human genetics because they are extremely abundant and well-suited for automated large-scale genotyping. Human SNPs, however, are less informative than other types of genetic markers (such as simple-sequence length polymorphisms or microsatellites) and thus more loci are required for mapping traits. SNPs offer similar advantages for experimental genetic organisms such as the mouse, but they entail no loss of informativeness because bi-allelic markers are fully informative in analysing crosses between inbred strains. Here we report a large-scale analysis of SNPs in the mouse genome. We characterized the rate of nucleotide polymorphism in eight mouse strains and identified a collection of 2,848 SNPs located in 1,755 sequence-tagged sites (STSs) using high-density oligonucleotide arrays. Three-quarters of these SNPs have been mapped on the mouse genome, providing a first-generation SNP map of the mouse. We have also developed a multiplex genotyping procedure by which a genome scan can be performed with only six genotyping reactions per animal.

4.
Genetic mapping with SNP markers in Drosophila.   Total citations: 10 (self-citations: 0, citations by others: 10)
Map-based positional cloning of Drosophila melanogaster genes is hampered by both the time-consuming, error-prone nature of traditional methods for genetic mapping and the difficulties in aligning the genetic and cytological maps with the genome sequence. The identification of sequence polymorphisms in the Drosophila genome will make it possible to map mutations directly to the genome sequence with high accuracy and resolution. Here we report the identification of 7,223 single-nucleotide polymorphisms (SNPs) and 1,392 insertions/deletions (InDels) in common laboratory strains of Drosophila. These sequence polymorphisms define a map of 787 autosomal marker loci with a resolution of 114 kb. We have established PCR product-length polymorphism (PLP) or restriction fragment-length polymorphism (RFLP) assays for 215 of these markers. We demonstrate the use of this map by delimiting two mutations to intervals of 169 kb and 307 kb, respectively. Using a local high-density SNP map, we also mapped a third mutation to a resolution of approximately 2 kb, sufficient to localize the mutation within a single gene. These methods should accelerate the rate of positional cloning in Drosophila.

5.
Integration of genome-wide expression profiling with linkage analysis is a new approach to identifying genes underlying complex traits. We applied this approach to the regulation of gene expression in the BXH/HXB panel of rat recombinant inbred strains, one of the largest available rodent recombinant inbred panels and a leading resource for genetic analysis of the highly prevalent metabolic syndrome. In two tissues important to the pathogenesis of the metabolic syndrome, we mapped cis- and trans-regulatory control elements for expression of thousands of genes across the genome. Many of the most highly linked expression quantitative trait loci are regulated in cis, are inherited essentially as monogenic traits and are good candidate genes for previously mapped physiological quantitative trait loci in the rat. By comparative mapping we generated a data set of 73 candidate genes for hypertension that merit testing in human populations. Mining of this publicly available data set is expected to lead to new insights into the genes and regulatory pathways underlying the extensive range of metabolic and cardiovascular disease phenotypes that segregate in these recombinant inbred strains.

6.
Recombination and linkage disequilibrium in Arabidopsis thaliana   Total citations: 4 (self-citations: 0, citations by others: 4)
Linkage disequilibrium (LD) is a major aspect of the organization of genetic variation in natural populations. Here we describe the genome-wide pattern of LD in a sample of 19 Arabidopsis thaliana accessions using 341,602 non-singleton SNPs. LD decays within 10 kb on average, considerably faster than previously estimated. Tag SNP selection algorithms and 'hide-the-SNP' simulations suggest that genome-wide association mapping will require only 40%-50% of the observed SNPs, a reduction similar to estimates in a sample of African Americans. An Affymetrix genotyping array containing 250,000 SNPs has been designed based on these results; we demonstrate that it should have more than adequate coverage for genome-wide association mapping. The extent of LD is highly variable, and we find clear evidence of recombination hotspots, which seem to occur preferentially in intergenic regions. LD also reflects the action of selection, and it is more extensive between nonsynonymous polymorphisms than between synonymous polymorphisms.

7.
8.
Single-nucleotide polymorphisms (SNPs) have been explored as a high-resolution marker set for accelerating the mapping of disease genes. Here we report 48,196 candidate SNPs detected by statistical analysis of human expressed sequence tags (ESTs), associated primarily with coding regions of genes. We used Bayesian inference to weigh evidence for true polymorphism versus sequencing error, misalignment or ambiguity, misclustering or chimaeric EST sequences, assessing data such as raw chromatogram height, sharpness, overlap and spacing, sequencing error rates, context-sensitivity and cDNA library origin. Three separate validations (comparison with 54 genes screened for SNPs independently, verification of HLA-A polymorphisms, and restriction fragment length polymorphism (RFLP) testing) verified 70%, 89% and 71% of our predicted SNPs, respectively. Our method detects tenfold more true HLA-A SNPs than previous analyses of the EST data. We found SNPs in a large fraction of known disease genes, including some disease-causing mutations (for example, the HbS sickle-cell mutation). Our comprehensive analysis of human coding region polymorphism provides a public resource for mapping of disease genes (available at http://www.bioinformatics.ucla.edu/snp).

9.
The nematode Caenorhabditis elegans is central to research in molecular, cell and developmental biology, but nearly all of this research has been conducted on a single strain of C. elegans. Little is known about the population genomic and evolutionary history of this species. We characterized C. elegans genetic variation using high-throughput selective sequencing of a worldwide collection of 200 wild strains and identified 41,188 SNPs. Notably, C. elegans genome variation is dominated by a set of commonly shared haplotypes on four of its six chromosomes, each spanning many megabases. Population genetic modeling showed that this pattern was generated by chromosome-scale selective sweeps that have reduced variation worldwide; at least one of these sweeps probably occurred in the last few hundred years. These sweeps, which we hypothesize to be a result of human activity, have drastically reshaped the global C. elegans population in the recent past.

10.
Numerous types of DNA variation exist, ranging from SNPs to larger structural alterations such as copy number variants (CNVs) and inversions. Alignment of DNA sequence from different sources has been used to identify SNPs and intermediate-sized variants (ISVs). However, only a small proportion of total heterogeneity is characterized, and little is known of the characteristics of most smaller-sized (<50 kb) variants. Here we show that genome assembly comparison is a robust approach for identification of all classes of genetic variation. Through comparison of two human assemblies (Celera's R27c compilation and the Build 35 reference sequence), we identified megabases of sequence (in the form of 13,534 putative non-SNP events) that were absent, inverted or polymorphic in one assembly. Database comparison and laboratory experimentation further demonstrated overlap or validation for 240 variable regions and confirmed >1.5 million SNPs. Some differences were simple insertions and deletions, but in regions containing CNVs, segmental duplication and repetitive DNA, they were more complex. Our results uncover substantial undescribed variation in humans, highlighting the need for comprehensive annotation strategies to fully interpret genome scanning and personalized sequencing projects.

11.
Characterizing genetic diversity within and between populations has broad applications in studies of human disease and evolution. We propose a new approach, spatial ancestry analysis, for the modeling of genotypes in two- or three-dimensional space. In spatial ancestry analysis (SPA), we explicitly model the spatial distribution of each SNP by assigning an allele frequency as a continuous function in geographic space. We show that the explicit modeling of the allele frequency allows individuals to be localized on the map on the basis of their genetic information alone. We apply our SPA method to a European and a worldwide population genetic variation data set and identify SNPs showing large gradients in allele frequency, and we suggest these as candidate regions under selection. These regions include SNPs in the well-characterized LCT region, as well as at loci including FOXP2, OCA2 and LRP1B.

12.
Interindividual variability in drug response, ranging from no therapeutic benefit to life-threatening adverse reactions, is influenced by variation in genes that control the absorption, distribution, metabolism and excretion of drugs. We genotyped 904 single-nucleotide polymorphisms (SNPs) from 55 such genes in two population samples (European and Japanese) and identified a set of tagging SNPs that represents the common variation in these genes, both known and unknown. Extensive empirical evaluations, including a direct assessment of association with candidate functional SNPs in a new, larger population sample, validated the performance of these tagging SNPs and confirmed their utility for linkage-disequilibrium mapping in pharmacogenetics. The analyses also suggest that rare variation is not amenable to tagging strategies.

13.
Schizophrenia is a complex disorder caused by both genetic and environmental factors. Using 9,087 affected individuals, 12,171 controls and 915,354 imputed SNPs from the Schizophrenia Psychiatric Genome-Wide Association Study (GWAS) Consortium (PGC-SCZ), we estimate that 23% (s.e. = 1%) of variation in liability to schizophrenia is captured by SNPs. We show that a substantial proportion of this variation must be the result of common causal variants, that the variance explained by each chromosome is linearly related to its length (r = 0.89, P = 2.6 × 10⁻⁸), that the genetic basis of schizophrenia is the same in males and females, and that a disproportionate proportion of variation is attributable to a set of 2,725 genes expressed in the central nervous system (CNS; P = 7.6 × 10⁻⁸). These results are consistent with a polygenic genetic architecture and imply more individual SNP associations will be detected for this disease as sample size increases.

14.
Linkage disequilibrium (LD) mapping provides a powerful method for fine-structure localization of rare disease genes, but has not yet been widely applied to common disease. We sought to design a systematic approach for LD mapping and apply it to the localization of a gene (IBD5) conferring susceptibility to Crohn disease. The key issues are: (i) to detect a significant LD signal, (ii) to rigorously bound the critical region and (iii) to identify the causal genetic variant within this region. We previously mapped the IBD5 locus to a large region spanning 18 cM of chromosome 5q31 (P < 10⁻⁴). Using dense genetic maps of microsatellite markers and single-nucleotide polymorphisms (SNPs) across the entire region, we found strong evidence of LD. We bound the region to a common haplotype spanning 250 kb that shows strong association with the disease (P < 2 × 10⁻⁷) and contains the cytokine gene cluster. This finding provides overwhelming evidence that a specific common haplotype of the cytokine region in 5q31 confers susceptibility to Crohn disease. However, genetic evidence alone is not sufficient to identify the causal mutation within this region, as strong LD across the region results in multiple SNPs having equivalent genetic evidence, each consistent with the expected properties of the IBD5 locus. These results have important implications for Crohn disease in particular and LD mapping in general.

15.
We estimate and partition genetic variation for height, body mass index (BMI), von Willebrand factor and QT interval (QTi) using 586,898 SNPs genotyped on 11,586 unrelated individuals. We estimate that ~45%, ~17%, ~25% and ~21% of the variance in height, BMI, von Willebrand factor and QTi, respectively, can be explained by all autosomal SNPs and a further ~0.5-1% can be explained by X chromosome SNPs. We show that the variance explained by each chromosome is proportional to its length, and that SNPs in or near genes explain more variation than SNPs between genes. We propose a new approach to estimate variation due to cryptic relatedness and population stratification. Our results provide further evidence that a substantial proportion of heritability is captured by common SNPs, that height, BMI and QTi are highly polygenic traits, and that the additive variation explained by a part of the genome is approximately proportional to the total length of DNA contained within genes therein.

16.
To identify risk variants for lung cancer, we conducted a multistage genome-wide association study. In the discovery phase, we analyzed 315,450 tagging SNPs in 1,154 current and former (ever) smoking cases of European ancestry and 1,137 frequency-matched, ever-smoking controls from Houston, Texas. For replication, we evaluated the ten SNPs most significantly associated with lung cancer in an additional 711 cases and 632 controls from Texas and 2,013 cases and 3,062 controls from the UK. Two SNPs, rs1051730 and rs8034191, mapping to a region of strong linkage disequilibrium within 15q25.1 containing PSMA4 and the nicotinic acetylcholine receptor subunit genes CHRNA3 and CHRNA5, were significantly associated with risk in both replication sets. Combined analysis yielded odds ratios of 1.32 (P < 1 × 10⁻¹⁷) for both SNPs. Haplotype analysis was consistent with there being a single risk variant in this region. We conclude that variation in a region of 15q25.1 containing nicotinic acetylcholine receptor genes contributes to lung cancer risk.

17.
A major goal in human genetics is to understand the role of common genetic variants in susceptibility to common diseases. This will require characterizing the nature of gene variation in human populations, assembling an extensive catalogue of single-nucleotide polymorphisms (SNPs) in candidate genes and performing association studies for particular diseases. At present, our knowledge of human gene variation remains rudimentary. Here we describe a systematic survey of SNPs in the coding regions of human genes. We identified SNPs in 106 genes relevant to cardiovascular disease, endocrinology and neuropsychiatry by screening an average of 114 independent alleles using 2 independent screening methods. To ensure high accuracy, all reported SNPs were confirmed by DNA sequencing. We identified 560 SNPs, including 392 coding-region SNPs (cSNPs) divided roughly equally between those causing synonymous and non-synonymous changes. We observed different rates of polymorphism among classes of sites within genes (non-coding, degenerate and non-degenerate) as well as between genes. The cSNPs most likely to influence disease, those that alter the amino acid sequence of the encoded protein, are found at a lower rate and with lower allele frequencies than silent substitutions. This likely reflects selection acting against deleterious alleles during human evolution. The lower allele frequency of missense cSNPs has implications for the compilation of a comprehensive catalogue, as well as for the subsequent application to disease association.

18.
Humans show great variation in phenotypic traits such as height, eye color and susceptibility to disease. Genomic DNA sequence differences among individuals are responsible for the inherited components of these complex traits. Reports suggest that intermediate and large-scale DNA copy number and structural variations are prevalent enough to be an important source of genetic variation between individuals. Because association studies to identify genomic loci associated with particular phenotypic traits have focused primarily on genotyping SNPs, it is important to determine whether common structural polymorphisms are in linkage disequilibrium with common SNPs, and thus can be assessed indirectly in SNP-based studies. Here we examine 100 deletion polymorphisms ranging from 70 bp to 7 kb. We show that common deletions and SNPs ascertained with similar criteria have essentially the same distribution of linkage disequilibrium with surrounding SNPs, indicating that these polymorphisms may share evolutionary history and that most deletion polymorphisms are effectively assayed by proxy in SNP-based association studies.

19.
Genome-wide patterns of genetic variation among elite maize inbred lines   Total citations: 6 (self-citations: 0, citations by others: 6)
Lai J, Li R, Xu X, Jin W, Xu M, Zhao H, Xiang Z, Song W, Ying K, Zhang M, Jiao Y, Ni P, Zhang J, Li D, Guo X, Ye K, Jian M, Wang B, Zheng H, Liang H, Zhang X, Wang S, Chen S, Li J, Fu Y, Springer NM, Yang H, Wang J, Dai J, Schnable PS, Wang J. Nature Genetics, 2010, 42(11): 1027-1030
We have resequenced a group of six elite maize inbred lines, including the parents of the most productive commercial hybrid in China. This effort uncovered more than 1,000,000 SNPs, 30,000 indel polymorphisms and 101 low-sequence-diversity chromosomal intervals in the maize genome. We also identified several hundred complete genes that show presence/absence variation among these resequenced lines. We discuss the potential roles of complementation of presence/absence variations and other deleterious mutations in contributing to heterosis. High-density SNP and indel polymorphism markers reported here are expected to be a valuable resource for future genetic studies and the molecular breeding of this important crop.

20.
Genome-wide mapping with biallelic markers in Arabidopsis thaliana.   Total citations: 17 (self-citations: 0, citations by others: 17)
Single-nucleotide polymorphisms, as well as small insertions and deletions (here referred to collectively as simple nucleotide polymorphisms, or SNPs), comprise the largest set of sequence variants in most organisms. Positional cloning based on SNPs may accelerate the identification of human disease traits and a range of biologically informative mutations. The recent application of high-density oligonucleotide arrays to allele identification has made it feasible to genotype thousands of biallelic SNPs in a single experiment. It has yet to be established, however, whether SNP detection using oligonucleotide arrays can be used to accelerate the mapping of traits in diploid genomes. The cruciferous weed Arabidopsis thaliana is an attractive model system for the construction and use of biallelic SNP maps. Although important biological processes ranging from fertilization and cell fate determination to disease resistance have been modelled in A. thaliana, identifying mutations in this organism has been impeded by the lack of a high-density genetic map consisting of easily genotyped DNA markers. We report here the construction of a biallelic genetic map in A. thaliana with a resolution of 3.5 cM and its use in mapping Eds16, a gene involved in the defence response to the fungal pathogen Erysiphe orontii. Mapping of this trait involved the high-throughput generation of meiotic maps of F2 individuals using high-density oligonucleotide probe array-based genotyping. We developed a software package called InterMap and used it to automatically delimit Eds16 to a 7-cM interval on chromosome 1. These results are the first demonstration of biallelic mapping in diploid genomes and establish means for generalizing SNP-based maps to virtually any genetic organism.
