Similar articles
 Found 20 similar articles; search took 15 ms
1.
The genetic architectures of common, complex diseases are largely uncharacterized. We modeled the genetic architecture underlying genome-wide association study (GWAS) data for rheumatoid arthritis and developed a new method using polygenic risk-score analyses to infer the total liability-scale variance explained by associated GWAS SNPs. Using this method, we estimated that, together, thousands of SNPs from rheumatoid arthritis GWAS explain an additional 20% of disease risk (excluding known associated loci). We further tested this method on datasets for three additional diseases and obtained comparable estimates for celiac disease (43% excluding the major histocompatibility complex), myocardial infarction and coronary artery disease (48%) and type 2 diabetes (49%). Our results are consistent with simulated genetic models in which hundreds of associated loci harbor common causal variants and a smaller number of loci harbor multiple rare causal variants. These analyses suggest that GWAS will continue to be highly productive for the discovery of additional susceptibility loci for common diseases.
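The polygenic risk-score analyses mentioned in this abstract build on a simple quantity: a weighted sum of an individual's risk-allele counts. The following is a minimal Python sketch of that quantity only, assuming per-SNP weights such as log odds ratios from a discovery GWAS. The function name, weights and genotypes are hypothetical illustrations, not code or values from the paper, and the sketch does not reproduce the authors' method for inferring liability-scale variance.

```python
def polygenic_risk_score(dosages, weights):
    """Return the weighted sum of risk-allele counts for one individual.

    dosages: risk-allele counts per SNP (0, 1 or 2), hypothetical input.
    weights: per-SNP effect sizes, e.g. log odds ratios (assumed).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical example: three SNPs, one individual.
example_weights = [0.10, -0.05, 0.20]
example_dosages = [2, 1, 0]
example_score = polygenic_risk_score(example_dosages, example_weights)
```

In practice such scores are computed over thousands of SNPs and then tested for association with case-control status in an independent sample.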

2.
Schizophrenia is a complex disorder caused by both genetic and environmental factors. Using 9,087 affected individuals, 12,171 controls and 915,354 imputed SNPs from the Schizophrenia Psychiatric Genome-Wide Association Study (GWAS) Consortium (PGC-SCZ), we estimate that 23% (s.e. = 1%) of variation in liability to schizophrenia is captured by SNPs. We show that a substantial proportion of this variation must be the result of common causal variants, that the variance explained by each chromosome is linearly related to its length (r = 0.89, P = 2.6 × 10⁻⁸), that the genetic basis of schizophrenia is the same in males and females, and that a disproportionate proportion of variation is attributable to a set of 2,725 genes expressed in the central nervous system (CNS; P = 7.6 × 10⁻⁸). These results are consistent with a polygenic genetic architecture and imply more individual SNP associations will be detected for this disease as sample size increases.

3.
Noncoding variants at human chromosome 9p21 near CDKN2A and CDKN2B are associated with type 2 diabetes, myocardial infarction, aneurysm, vertical cup disc ratio and at least five cancers. Here we compare approaches to more comprehensively assess genetic variation in the region. We carried out targeted sequencing at high coverage in 47 individuals and compared the results to pilot data from the 1000 Genomes Project. We imputed variants into type 2 diabetes and myocardial infarction cohorts directly from targeted sequencing, from a genotyped reference panel derived from sequencing and from 1000 Genomes Project low-coverage data. Polymorphisms with frequency >5% were captured well by all strategies. Imputation of intermediate-frequency polymorphisms required a higher density of tag SNPs in disease samples than is available on first-generation genome-wide association study (GWAS) arrays. Our association analyses identified more comprehensive sets of variants showing equivalent statistical association with type 2 diabetes or myocardial infarction, but did not identify stronger associations than the original GWAS signals.

4.
Huang X, Zhao Y, Wei X, Li C, Wang A, Zhao Q, Li W, Guo Y, Deng L, Zhu C, Fan D, Lu Y, Weng Q, Liu K, Zhou T, Jing Y, Si L, Dong G, Huang T, Lu T, Feng Q, Qian Q, Li J, Han B. Nature Genetics, 2012, 44(1): 32-39
A high-density haplotype map recently enabled a genome-wide association study (GWAS) in a population of indica subspecies of Chinese rice landraces. Here we extend this methodology to a larger and more diverse sample of 950 worldwide rice varieties, including the Oryza sativa indica and Oryza sativa japonica subspecies, to perform an additional GWAS. We identified a total of 32 new loci associated with flowering time and with ten grain-related traits, indicating that the larger sample increased the power to detect trait-associated variants using GWAS. To characterize various alleles and complex genetic variation, we developed an analytical framework for haplotype-based de novo assembly of the low-coverage sequencing data in rice. We identified candidate genes for 18 associated loci through detailed annotation. This study shows that the integrated approach of sequence-based GWAS and functional genome annotation has the potential to match complex traits to their causal polymorphisms in rice.

5.
Genome-wide association studies (GWAS) have identified dozens of risk loci for many complex disorders, including Crohn's disease. However, common disease-associated SNPs explain at most ~20% of the genetic variance for Crohn's disease. Several factors may account for this unexplained heritability, including rare risk variants not adequately tagged thus far in GWAS. That rare susceptibility variants indeed contribute to variation in multifactorial phenotypes has been demonstrated for colorectal cancer, plasma high-density lipoprotein cholesterol levels, blood pressure, type 1 diabetes, hypertriglyceridemia and, in the case of Crohn's disease, for NOD2 (refs. 14,15). Here we describe the use of high-throughput resequencing of DNA pools to search for rare coding variants influencing susceptibility to Crohn's disease in 63 GWAS-identified positional candidate genes. We identify low frequency coding variants conferring protection against inflammatory bowel disease in IL23R, but we conclude that rare coding variants in positional candidates do not make a large contribution to inherited predisposition to Crohn's disease.

6.
More than 1,000 susceptibility loci have been identified through genome-wide association studies (GWAS) of common variants; however, the specific genes and full allelic spectrum of causal variants underlying these findings have not yet been defined. Here we used pooled next-generation sequencing to study 56 genes from regions associated with Crohn's disease in 350 cases and 350 controls. Through follow-up genotyping of 70 rare and low-frequency protein-altering variants in nine independent case-control series (16,054 Crohn's disease cases, 12,153 ulcerative colitis cases and 17,575 healthy controls), we identified four additional independent risk factors in NOD2, two additional protective variants in IL23R, a highly significant association with a protective splice variant in CARD9 (P < 1 × 10⁻¹⁶, odds ratio ≈ 0.29) and additional associations with coding variants in IL18RAP, CUL2, C1orf106, PTPN22 and MUC19. We extend the results of successful GWAS by identifying new, rare and probably functional variants that could aid functional experiments and predictive models.

7.
Genome-wide association studies (GWAS) have proven to be a powerful method to identify common genetic variants contributing to susceptibility to common diseases. Here, we show that extremely low-coverage sequencing (0.1-0.5×) captures almost as much of the common (>5%) and low-frequency (1-5%) variation across the genome as SNP arrays. As an empirical demonstration, we show that genome-wide SNP genotypes can be inferred at a mean r² of 0.71 using off-target data (0.24× average coverage) in a whole-exome study of 909 samples. Using both simulated and real exome-sequencing data sets, we show that association statistics obtained using extremely low-coverage sequencing data attain similar P values at known associated variants as data from genotyping arrays, without an excess of false positives. Within the context of reductions in sample preparation and sequencing costs, funds invested in extremely low-coverage sequencing can yield several times the effective sample size of GWAS based on SNP array data and a commensurate increase in statistical power.

8.
Nuclear magnetic resonance (NMR) assays allow for measurement of a wide range of metabolic phenotypes. We report here the results of a GWAS on 8,330 Finnish individuals genotyped and imputed at 7.7 million SNPs for a range of 216 serum metabolic phenotypes assessed by NMR of serum samples. We identified significant associations (P < 2.31 × 10⁻¹⁰) at 31 loci, including 11 for which there have not been previous reports of associations to a metabolic trait or disorder. Analyses of Finnish twin pairs suggested that the metabolic measures reported here show higher heritability than comparable conventional metabolic phenotypes. In accordance with our expectations, SNPs at the 31 loci associated with individual metabolites account for a greater proportion of the genetic component of trait variance (up to 40%) than is typically observed for conventional serum metabolic phenotypes. The identification of such associations may provide substantial insight into cardiometabolic disorders.

9.
Population stratification refers to differences in allele frequencies between cases and controls due to systematic differences in ancestry rather than association of genes with disease. It has been proposed that false positive associations due to stratification can be controlled by genotyping a few dozen unlinked genetic markers. To assess stratification empirically, we analyzed data from 11 case-control and case-cohort association studies. We did not detect statistically significant evidence for stratification but did observe that assessments based on a few dozen markers lack power to rule out moderate levels of stratification that could cause false positive associations in studies designed to detect modest genetic risk factors. After increasing the number of markers and samples in a case-cohort study (the design most immune to stratification), we found that stratification was in fact present. Our results suggest that modest amounts of stratification can exist even in well designed studies.

10.
Association studies offer a potentially powerful approach to identify genetic variants that influence susceptibility to common disease, but are plagued by the impression that they are not consistently reproducible. In principle, the inconsistency may be due to false positive studies, false negative studies or true variability in association among different populations. The critical question is whether false positives overwhelmingly explain the inconsistency. We analyzed 301 published studies covering 25 different reported associations. There was a large excess of studies replicating the first positive reports, inconsistent with the hypothesis of no true positive associations (P < 10⁻¹⁴). This excess of replications could not be reasonably explained by publication bias and was concentrated among 11 of the 25 associations. For 8 of these 11 associations, pooled analysis of follow-up studies yielded statistically significant replication of the first report, with modest estimated genetic effects. Thus, a sizable fraction (but under half) of reported associations have strong evidence of replication; for these, false negative, underpowered studies probably contribute to inconsistent replication. We conclude that there are probably many common variants in the human genome with modest but real effects on common disease risk, and that studies using large samples will convincingly identify such variants.

11.
Well-powered genome-wide association studies, now made possible through advances in technology and large-scale collaborative projects, promise to characterize the contribution of rare variants to complex traits and disease. However, while population structure is a known confounder of association studies, it remains unknown whether methods developed to control stratification are equally effective for rare variants. Here, we demonstrate that rare variants can show a stratification that is systematically different from, and typically stronger than, common variants, and this is not necessarily corrected by existing methods. We show that the same process leads to inflation for load-based tests and can obscure signals at truly associated variants. Furthermore, we show that populations can display spatial structure in rare variants, even when Wright's fixation index F_ST is low, but that allele frequency-dependent metrics of allele sharing can reveal localized stratification. These results underscore the importance of collecting and integrating spatial information in the genetic analysis of complex traits.

12.
The impact of population structure on association studies undertaken to identify genetic variants underlying common human diseases is an issue of growing interest. Spurious associations of alleles with disease phenotypes may be obtained or true associations overlooked when allele frequencies differ notably among subpopulations that are not represented equally among cases and controls. Population structure influences even carefully designed studies and can affect the validity of association results. Most study designs address this problem by sampling cases and controls from groups that share the same nationality or self-reported ethnic background, with the implicit assumption that no substructure exists within such groups. We examined population structure in the Icelandic gene pool using extensive genealogical and genetic data. Our results indicate that sampling strategies need to take account of substructure even in a relatively homogenous genetic isolate. This will probably be even more important in larger populations.

13.
Variation in DNA sequence contributes to individual differences in quantitative traits, but in humans the specific sequence variants are known for very few traits. We characterized variation in gene expression in cells from individuals belonging to three major population groups. This quantitative phenotype differs significantly between European-derived and Asian-derived populations for 1,097 of 4,197 genes tested. For the phenotypes with the strongest evidence of cis determinants, most of the variation is due to allele frequency differences at cis-linked regulators. The results show that specific genetic variation among populations contributes appreciably to differences in gene expression phenotypes. Populations differ in prevalence of many complex genetic diseases, such as diabetes and cardiovascular disease. As some of these are probably influenced by the level of gene expression, our results suggest that allele frequency differences at regulatory polymorphisms also account for some population differences in prevalence of complex diseases.

14.
Following up on recent genome-wide association studies (GWAS) of Crohn's disease, we investigated 50 previously reported susceptibility loci in a German sample of individuals with Crohn's disease (n = 1,850) or ulcerative colitis (n = 1,103) and healthy controls (n = 1,817). Among these loci, we identified variants in 3p21.31, NKX2-3 and CCNY as susceptibility factors for both diseases, whereas variants in PTPN2, HERC2 and STAT3 were associated only with ulcerative colitis in our sample collection.

15.
Efficiency and power in genetic association studies   (Total citations: 30; self-citations: 0; citations by others: 30)
We investigated selection and analysis of tag SNPs for genome-wide association studies by specifically examining the relationship between investment in genotyping and statistical power. Do pairwise or multimarker methods maximize efficiency and power? To what extent is power compromised when tags are selected from an incomplete resource such as HapMap? We addressed these questions using genotype data from the HapMap ENCODE project, association studies simulated under a realistic disease model, and empirical correction for multiple hypothesis testing. We demonstrate a haplotype-based tagging method that uniformly outperforms single-marker tests and methods for prioritization that markedly increase tagging efficiency. Examining all observed haplotypes for association, rather than just those that are proxies for known SNPs, increases power to detect rare causal alleles, at the cost of reduced power to detect common causal alleles. Power is robust to the completeness of the reference panel from which tags are selected. These findings have implications for prioritizing tag SNPs and interpreting association studies.

16.
Most human sequence variation is in the form of single-nucleotide polymorphisms (SNPs). It has been proposed that coding-region SNPs (cSNPs) be used for direct association studies to determine the genetic basis of complex traits. The success of such studies depends on the frequency of disease-associated alleles, and their distribution in different ethnic populations. If disease-associated alleles are frequent in most populations, then direct genotyping of candidate variants could show robust associations in manageable study samples. This approach is less feasible if the genetic risk from a given candidate gene is due to many infrequent alleles. Previous studies of several genes demonstrated that most variants are relatively infrequent (<0.05). These surveys genotyped small samples (n<75) and thus had limited ability to identify rare alleles. Here we evaluate the prevalence and distribution of such rare alleles by genotyping an ethnically diverse reference sample that is more than six times larger than those used in previous studies (n=450). We screened for variants in the complete coding sequence and intron-exon junctions of two candidate genes for neuropsychiatric phenotypes: SLC6A4, encoding the serotonin transporter; and SLC18A2, encoding the vesicular monoamine transporter. Both genes have unique roles in neuronal transmission, and variants in either gene might be associated with neurobehavioral phenotypes.

17.
The effects of human population structure on large genetic association studies   (Total citations: 21; self-citations: 0; citations by others: 21)
Large-scale association studies hold substantial promise for unraveling the genetic basis of common human diseases. A well-known problem with such studies is the presence of undetected population structure, which can lead to both false positive results and failures to detect genuine associations. Here we examine approximately 15,000 genome-wide single-nucleotide polymorphisms typed in three population groups to assess the consequences of population structure on the coming generation of association studies. The consequences of population structure on association outcomes increase markedly with sample size. For the size of study needed to detect typical genetic effects in common diseases, even the modest levels of population structure within population groups cannot safely be ignored. We also examine one method for correcting for population structure (Genomic Control). Although it often performs well, it may not correct for structure if too few loci are used and may overcorrect in other settings, leading to substantial loss of power. The results of our analysis can guide the design of large-scale association studies.
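Genomic Control, named in this abstract, is usually described as rescaling association statistics by an inflation factor estimated from the data. The sketch below shows that textbook form in Python, assuming 1-degree-of-freedom chi-square statistics from loci that are mostly null. The function names are hypothetical, and clamping the factor at 1 is a common convention assumed here, not a detail taken from the paper.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control_lambda(chi2_stats):
    """Estimate the inflation factor as observed median / expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def apply_genomic_control(chi2_stats):
    """Divide each statistic by lambda; lambda below 1 is treated as 1 (assumed convention)."""
    lam = max(1.0, genomic_control_lambda(chi2_stats))
    return [x / lam for x in chi2_stats]
```

Because lambda is a median over the loci supplied, an estimate from only a handful of loci is noisy, which is in line with the abstract's caution about using too few loci.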

18.
The Human Genome Project and its spin-offs are making it increasingly feasible to determine the genetic basis of complex traits using genome-wide association studies. The statistical challenge of analyzing such studies stems from the severe multiple-comparison problem resulting from the analysis of thousands of SNPs. Our methodology for genome-wide family-based association studies, using single SNPs or haplotypes, can identify associations that achieve genome-wide significance. In relation to developing guidelines for our screening tools, we determined lower bounds for the estimated power to detect the gene underlying the disease-susceptibility locus, which hold regardless of the linkage disequilibrium structure present in the data. We also assessed the power of our approach in the presence of multiple disease-susceptibility loci. Our screening tools accommodate genomic control and use the concept of haplotype-tagging SNPs. Our methods use the entire sample and do not require separate screening and validation samples to establish genome-wide significance, as population-based designs do.

19.
Genome-wide association studies (GWAS) have identified ten loci harboring common variants that influence risk of developing colorectal cancer (CRC). To enhance the power to identify additional CRC risk loci, we conducted a meta-analysis of three GWAS from the UK which included a total of 3,334 affected individuals (cases) and 4,628 controls followed by multiple validation analyses including a total of 18,095 cases and 20,197 controls. We identified associations at four new CRC risk loci: 1q41 (rs6691170, odds ratio (OR) = 1.06, P = 9.55 × 10?1? and rs6687758, OR = 1.09, P = 2.27 × 10??), 3q26.2 (rs10936599, OR = 0.93, P = 3.39 × 10??), 12q13.13 (rs11169552, OR = 0.92, P = 1.89 × 10?1? and rs7136702, OR = 1.06, P = 4.02 × 10??) and 20q13.33 (rs4925386, OR = 0.93, P = 1.89 × 10?1?). In addition to identifying new CRC risk loci, this analysis provides evidence that additional CRC-associated variants of similar effect size remain to be discovered.

20.
Genome-wide association is a promising approach to identify common genetic variants that predispose to human disease. Because of the high cost of genotyping hundreds of thousands of markers on thousands of subjects, genome-wide association studies often follow a staged design in which a proportion (pi(samples)) of the available samples are genotyped on a large number of markers in stage 1, and a proportion (pi(markers)) of these markers are later followed up by genotyping them on the remaining samples in stage 2. The standard strategy for analyzing such two-stage data is to view stage 2 as a replication study and focus on findings that reach statistical significance when stage 2 data are considered alone. We demonstrate that the alternative strategy of jointly analyzing the data from both stages almost always results in increased power to detect genetic association, despite the need to use more stringent significance levels, even when effect sizes differ between the two stages. We recommend joint analysis for all two-stage genome-wide association studies, especially when a relatively large proportion of the samples are genotyped in stage 1 (pi(samples) ≥ 0.30), and a relatively large proportion of markers are selected for follow-up in stage 2 (pi(markers) ≥ 0.01).
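One way to picture the joint analysis described in this abstract is as a weighted combination of the per-stage test statistics, with weights set by the share of samples in each stage. The Python sketch below assumes independent stage 1 and stage 2 z-statistics and square-root sample-proportion weights. The abstract does not give the formula, so this weighting and the function name are assumptions for illustration, not the authors' stated procedure.

```python
import math


def joint_z(z_stage1, z_stage2, pi_samples):
    """Combine per-stage z-statistics into one joint statistic.

    pi_samples: proportion of all samples genotyped in stage 1 (0 < pi < 1).
    The squared weights sum to 1, so if both inputs are independent
    standard normal under the null, the result is standard normal too.
    """
    if not 0.0 < pi_samples < 1.0:
        raise ValueError("pi_samples must lie strictly between 0 and 1")
    w1 = math.sqrt(pi_samples)
    w2 = math.sqrt(1.0 - pi_samples)
    return w1 * z_stage1 + w2 * z_stage2
```

For instance, two stages of equal size that each give z = 2.0 combine to a joint statistic of about 2.83, larger than either stage alone, which is the intuition behind the power gain the abstract reports.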
