Similar articles
 Found 20 similar articles (search time: 33 ms)
1.
Genome-wide association studies are set to become the method of choice for uncovering the genetic basis of human diseases. A central challenge in this area is the development of powerful multipoint methods that can detect causal variants that have not been directly genotyped. We propose a coherent analysis framework that treats the problem as one involving missing or uncertain genotypes. Central to our approach is a model-based imputation method for inferring genotypes at observed or unobserved SNPs, leading to improved power over existing methods for multipoint association mapping. Using real genome-wide association study data, we show that our approach (i) is accurate and well calibrated, (ii) provides detailed views of associated regions that facilitate follow-up studies and (iii) can be used to validate and correct data at genotyped markers. A notable future use of our method will be to boost power by combining data from genome-wide scans that use different SNP sets.

2.
Efficiency and power in genetic association studies   Total citations: 30, self-citations: 0, citations by others: 30
We investigated selection and analysis of tag SNPs for genome-wide association studies by specifically examining the relationship between investment in genotyping and statistical power. Do pairwise or multimarker methods maximize efficiency and power? To what extent is power compromised when tags are selected from an incomplete resource such as HapMap? We addressed these questions using genotype data from the HapMap ENCODE project, association studies simulated under a realistic disease model, and empirical correction for multiple hypothesis testing. We demonstrate a haplotype-based tagging method that uniformly outperforms single-marker tests and methods for prioritization that markedly increase tagging efficiency. Examining all observed haplotypes for association, rather than just those that are proxies for known SNPs, increases power to detect rare causal alleles, at the cost of reduced power to detect common causal alleles. Power is robust to the completeness of the reference panel from which tags are selected. These findings have implications for prioritizing tag SNPs and interpreting association studies.

3.
A general question for linkage disequilibrium-based association studies is how power to detect an association is compromised when tag SNPs are chosen from data in one population sample and then deployed in another sample. Specifically, it is important to know how well tags picked from the HapMap DNA samples capture the variation in other samples. To address this, we collected dense data uniformly across the four HapMap population samples and eleven other population samples. We picked tag SNPs using genotype data we collected in the HapMap samples and then evaluated the effective coverage of these tags in comparison to the entire set of common variants observed in the other samples. We simulated case-control association studies in the non-HapMap samples under a disease model of modest risk, and we observed little loss in power. These results demonstrate that the HapMap DNA samples can be used to select tags for genome-wide association studies in many samples around the world.

4.
Population structure causes genome-wide linkage disequilibrium between unlinked loci, leading to statistical confounding in genome-wide association studies. Mixed models have been shown to handle the confounding effects of a diffuse background of large numbers of loci of small effect well, but they do not always account for loci of larger effect. Here we propose a multi-locus mixed model as a general method for mapping complex traits in structured populations. Simulations suggest that our method outperforms existing methods in terms of power as well as false discovery rate. We apply our method to human and Arabidopsis thaliana data, identifying new associations and evidence for allelic heterogeneity. We also show how a priori knowledge from an A. thaliana linkage mapping study can be integrated into our method using a Bayesian approach. Our implementation is computationally efficient, making the analysis of large data sets (n > 10,000) practicable.

5.
Maize is both an exciting model organism in plant genetics and also the most important crop worldwide for food, animal feed and bioenergy production. Recent genome-wide association and metabolic profiling studies aimed to resolve quantitative traits to their causal genetic loci and key metabolic regulators. Here we present a complementary approach that exploits large-scale genomic and metabolic information to predict complex, highly polygenic traits in hybrid testcrosses. We crossed 285 diverse Dent inbred lines from worldwide sources with two testers and predicted their combining abilities for seven biomass- and bioenergy-related traits using 56,110 SNPs and 130 metabolites. Whole-genome and metabolic prediction models were built by fitting effects for all SNPs or metabolites. Prediction accuracies ranged from 0.72 to 0.81 for SNPs and from 0.60 to 0.80 for metabolites, allowing a reliable screening of large collections of diverse inbred lines for their potential to create superior hybrids.

6.
We present an approximate conditional and joint association analysis that can use summary-level statistics from a meta-analysis of genome-wide association studies (GWAS) and estimated linkage disequilibrium (LD) from a reference sample with individual-level genotype data. Using this method, we analyzed meta-analysis summary data from the GIANT Consortium for height and body mass index (BMI), with the LD structure estimated from genotype data in two independent cohorts. We identified 36 loci with multiple associated variants for height (38 leading and 49 additional SNPs, 87 in total) via a genome-wide SNP selection procedure. The 49 new SNPs explain approximately 1.3% of variance, nearly doubling the heritability explained at the 36 loci. We did not find any locus showing multiple associated SNPs for BMI. The method we present is computationally fast and is also applicable to case-control data, which we demonstrate in an example from meta-analysis of type 2 diabetes by the DIAGRAM Consortium.

7.
We conducted a genome-wide association study (GWAS) of breast cancer by genotyping 528,173 SNPs in 1,145 postmenopausal women of European ancestry with invasive breast cancer and 1,142 controls. We identified four SNPs in intron 2 of FGFR2 (which encodes a receptor tyrosine kinase and is amplified or overexpressed in some breast cancers) that were highly associated with breast cancer and confirmed this association in 1,776 affected individuals and 2,072 controls from three additional studies. Across the four studies, the association with all four SNPs was highly statistically significant (P_trend for the most strongly associated SNP (rs1219648) = 1.1 × 10^-10; population attributable risk = 16%). Four SNPs at other loci most strongly associated with breast cancer in the initial GWAS were not associated in the replication studies. Our summary results from the GWAS are available online in a form that should speed the identification of additional risk loci.

8.
9.
More than 5 million single-nucleotide polymorphisms (SNPs) with minor-allele frequency greater than 10% are expected to exist in the human genome. Some of these SNPs may be associated with risk of developing common diseases. To assess the power of currently available SNPs to detect such associations, we resequenced 50 genes in two ethnic samples and measured patterns of linkage disequilibrium between the subset of SNPs reported in dbSNP and the complete set of common SNPs. Our results suggest that using all 2.7 million SNPs currently in the database would detect nearly 80% of all common SNPs in European populations but only 50% of those common in the African American population and that efficient selection of a minimal subset of SNPs for use in association studies requires measurement of allele frequency and linkage disequilibrium relationships for all SNPs in dbSNP.

10.
We have previously reported multiple loci associated with prostate cancer susceptibility in a Japanese population using a genome-wide association study (GWAS). To identify additional prostate cancer susceptibility loci, we genotyped nine SNPs that were nominally associated with prostate cancer (P < 1 × 10^-4) in our previous GWAS in three independent studies of prostate cancer in Japanese men (2,557 individuals with prostate cancer (cases) and 3,003 controls). In a meta-analysis of our previous GWAS and the replication studies, which included a total of 7,141 prostate cancer cases and 11,804 controls from a single ancestry group, three new loci reached genome-wide significance on chromosomes 11q12 (rs1938781; P = 1.10 × 10^-10; FAM111A-FAM111B), 10q26 (rs2252004; P = 1.98 × 10^-8) and 3p11.2 (rs2055109; P = 3.94 × 10^-8). We also found suggestive evidence of association at a previously reported prostate cancer susceptibility locus at 2p11 (rs2028898; P = 1.08 × 10^-7). The identification of three new susceptibility loci should provide additional insight into the pathogenesis of prostate cancer and emphasizes the importance of conducting GWAS in diverse populations.

11.
Hair, skin and eye colors are highly heritable and visible traits in humans. We carried out a genome-wide association scan for variants associated with hair and eye pigmentation, skin sensitivity to sun and freckling among 2,986 Icelanders. We then tested the most closely associated SNPs from six regions (four not previously implicated in the normal variation of human pigmentation) and replicated their association in a second sample of 2,718 Icelanders and a sample of 1,214 Dutch. The SNPs from all six regions met the criteria for genome-wide significance. A variant in SLC24A4 is associated with eye and hair color, a variant near KITLG is associated with hair color, two coding variants in TYR are associated with eye color and freckles, and a variant on 6p25.3 is associated with freckles. The fifth region provided refinements to a previously reported association in OCA2, and the sixth encompasses previously described variants in MC1R.

12.
Whole-genome association studies are predicted to be especially powerful in isolated populations owing to increased linkage disequilibrium (LD) and decreased allelic diversity, but this possibility has not been empirically tested. We compared genome-wide data on 113,240 SNPs typed on 30 trios from the Pacific island of Kosrae to the same markers typed in the 270 samples from the International HapMap Project. The extent of LD is longer and haplotype diversity is lower in Kosrae than in the HapMap populations. More than 98% of Kosraen haplotypes are present in HapMap populations, indicating that HapMap will be useful for genetic studies on Kosrae. The long-range LD around common alleles and limited diversity result in improved efficiency in genetic studies in this population and augment the power to detect association of 'hidden SNPs'.

13.
We conducted a combined genome-wide association study (GWAS) of 7,481 individuals with bipolar disorder (cases) and 9,250 controls as part of the Psychiatric GWAS Consortium. Our replication study tested 34 SNPs in 4,496 independent cases with bipolar disorder and 42,422 independent controls and found that 18 of 34 SNPs had P < 0.05, with 31 of 34 SNPs having signals with the same direction of effect (P = 3.8 × 10^-7). An analysis of all 11,974 bipolar disorder cases and 51,792 controls confirmed genome-wide significant evidence of association for CACNA1C and identified a new intronic variant in ODZ4. We identified a pathway comprised of subunits of calcium channels enriched in bipolar disorder association intervals. Finally, a combined GWAS analysis of schizophrenia and bipolar disorder yielded strong association evidence for SNPs in CACNA1C and in the region of NEK4-ITIH1-ITIH3-ITIH4. Our replication results imply that increasing sample sizes in bipolar disorder will confirm many additional loci.

14.
Schizophrenia is a severe mental disorder affecting ~1% of the world population, with heritability of up to 80%. To identify new common genetic risk factors, we performed a genome-wide association study (GWAS) in the Han Chinese population. The discovery sample set consisted of 3,750 individuals with schizophrenia and 6,468 healthy controls (1,578 cases and 1,592 controls from northern Han Chinese, 1,238 cases and 2,856 controls from central Han Chinese, and 934 cases and 2,020 controls from the southern Han Chinese). We further analyzed the strongest association signals in an additional independent cohort of 4,383 cases and 4,539 controls from the Han Chinese population. Meta-analysis identified common SNPs that are associated with schizophrenia at genome-wide significance on 8p12 (rs16887244, P = 1.27 × 10^-10) and 1q24.2 (rs10489202, P = 9.50 × 10^-9). Our findings provide new insights into the pathogenesis of schizophrenia.

15.
Genome-wide association is a promising approach to identify common genetic variants that predispose to human disease. Because of the high cost of genotyping hundreds of thousands of markers on thousands of subjects, genome-wide association studies often follow a staged design in which a proportion (π_samples) of the available samples are genotyped on a large number of markers in stage 1, and a proportion (π_markers) of these markers are later followed up by genotyping them on the remaining samples in stage 2. The standard strategy for analyzing such two-stage data is to view stage 2 as a replication study and focus on findings that reach statistical significance when stage 2 data are considered alone. We demonstrate that the alternative strategy of jointly analyzing the data from both stages almost always results in increased power to detect genetic association, despite the need to use more stringent significance levels, even when effect sizes differ between the two stages. We recommend joint analysis for all two-stage genome-wide association studies, especially when a relatively large proportion of the samples are genotyped in stage 1 (π_samples ≥ 0.30), and a relatively large proportion of markers are selected for follow-up in stage 2 (π_markers ≥ 0.01).

16.
Genome-wide association studies involving hundreds of thousands of SNPs in thousands of cases and controls are now underway. The first of many analytical challenges in these studies involves the choice of SNPs to genotype. It is not practical to construct a different panel of tag SNPs for each study, so the first generation of genome-wide scans will use predefined, commercially available marker panels, which will in part dictate their success or failure. We compare different approaches in use today, and show that although many of them provide substantial coverage of common variation in non-African populations, the precise extent is strongly dependent on the frequencies of alleles of interest and on specific considerations of study design. Overall, despite substantial differences in genotyping technologies, marker selection strategies and number of markers assayed, the first-generation high-throughput platforms all offer similar levels of genome coverage.

17.
Multiple genetic loci associated with obesity or body mass index (BMI) have been identified through genome-wide association studies conducted predominantly in populations of European ancestry. We performed a meta-analysis of associations between BMI and approximately 2.4 million SNPs in 27,715 east Asians, which was followed by in silico and de novo replication studies in 37,691 and 17,642 additional east Asians, respectively. We identified ten BMI-associated loci at genome-wide significance (P < 5.0 × 10^-8), including seven previously identified loci (FTO, SEC16B, MC4R, GIPR-QPCTL, ADCY3-DNAJC27, BDNF and MAP2K5) and three novel loci in or near the CDKAL1, PCSK1 and GP2 genes. Three additional loci nearly reached the genome-wide significance threshold, including two previously identified loci in the GNPDA2 and TFAP2B genes and a newly identified signal near PAX6, all of which were associated with BMI with P < 5.0 × 10^-7. Findings from this study may shed light on new pathways involved in obesity and demonstrate the value of conducting genetic studies in non-European populations.

18.
We carried out a genome-wide association study of type-2 diabetes (T2D) in individuals of South Asian ancestry. Our discovery set included 5,561 individuals with T2D (cases) and 14,458 controls drawn from studies in London, Pakistan and Singapore. We identified 20 independent SNPs associated with T2D at P < 10^-4 for testing in a replication sample of 13,170 cases and 25,398 controls, also all of South Asian ancestry. In the combined analysis, we identified common genetic variants at six loci (GRB14, ST6GAL1, VPS26A, HMG20A, AP3S2 and HNF4A) newly associated with T2D (P = 4.1 × 10^-8 to P = 1.9 × 10^-11). SNPs at GRB14 were also associated with insulin sensitivity (P = 5.0 × 10^-4), and SNPs at ST6GAL1 and HNF4A were also associated with pancreatic beta-cell function (P = 0.02 and P = 0.001, respectively). Our findings provide additional insight into mechanisms underlying T2D and show the potential for new discovery from genetic association studies in South Asians, a population with increased susceptibility to T2D.

19.
Recombination and linkage disequilibrium in Arabidopsis thaliana   Total citations: 4, self-citations: 0, citations by others: 4
Linkage disequilibrium (LD) is a major aspect of the organization of genetic variation in natural populations. Here we describe the genome-wide pattern of LD in a sample of 19 Arabidopsis thaliana accessions using 341,602 non-singleton SNPs. LD decays within 10 kb on average, considerably faster than previously estimated. Tag SNP selection algorithms and 'hide-the-SNP' simulations suggest that genome-wide association mapping will require only 40%-50% of the observed SNPs, a reduction similar to estimates in a sample of African Americans. An Affymetrix genotyping array containing 250,000 SNPs has been designed based on these results; we demonstrate that it should have more than adequate coverage for genome-wide association mapping. The extent of LD is highly variable, and we find clear evidence of recombination hotspots, which seem to occur preferentially in intergenic regions. LD also reflects the action of selection, and it is more extensive between nonsynonymous polymorphisms than between synonymous polymorphisms.

20.
Emerging technologies make it possible for the first time to genotype hundreds of thousands of SNPs simultaneously, enabling whole-genome association studies. Using empirical genotype data from the International HapMap Project, we evaluate the extent to which the sets of SNPs contained on three whole-genome genotyping arrays capture common SNPs across the genome, and we find that the majority of common SNPs are well captured by these products either directly or through linkage disequilibrium. We explore analytical strategies that use HapMap data to improve power of association studies conducted with these fixed sets of markers and show that limited inclusion of specific haplotype tests in association analysis can increase the fraction of common variants captured by 25-100%. Finally, we introduce a Bayesian approach to association analysis by weighting the likelihood of each statistical test to reflect the number of putative causal alleles to which it is correlated.
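
Several of the abstracts above describe how well a fixed marker set "captures" common SNPs, either directly or through linkage disequilibrium. The following is a minimal sketch of that idea, not the method of any listed paper. It assumes genotypes coded as 0/1/2 allele counts, uses the squared Pearson correlation of genotype vectors as a stand-in for r^2 (studies often compute it from phased haplotypes instead), and takes r^2 >= 0.8 as the capture threshold, which is a common convention but is not stated in the abstracts. The function names `r_squared` and `coverage` and the toy data are hypothetical.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 counts)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def coverage(genotypes, tag_indices, threshold=0.8):
    """Fraction of SNPs that are tags or have r^2 >= threshold with some tag.

    The 0.8 default is an assumed convention, not a value from the abstracts.
    """
    tags = set(tag_indices)
    captured = 0
    for i, snp in enumerate(genotypes):
        if i in tags or any(
            r_squared(snp, genotypes[t]) >= threshold for t in tags
        ):
            captured += 1
    return captured / len(genotypes)


# Invented toy data: each row is one SNP, each column one individual.
toy = [
    [0, 1, 2, 0, 1, 2],  # SNP 0: on the hypothetical array
    [0, 1, 2, 0, 1, 2],  # SNP 1: identical to SNP 0, so r^2 = 1
    [2, 1, 0, 2, 1, 0],  # SNP 2: alleles flipped relative to SNP 0, r^2 = 1
    [0, 0, 1, 2, 2, 1],  # SNP 3: uncorrelated with SNP 0, r^2 = 0
]

print(coverage(toy, [0]))     # SNPs 0, 1 and 2 are captured; SNP 3 is not
print(coverage(toy, [0, 3]))  # adding SNP 3 as a tag captures everything
```

With only SNP 0 on the array, three of the four toy SNPs are captured. Adding a second tag raises coverage, which is the trade-off between genotyping investment and coverage that the tagging abstracts discuss.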
