Similar articles
 20 similar articles found; search took 15 ms
1.
Genome-wide association studies (GWAS) have proven to be a powerful method to identify common genetic variants contributing to susceptibility to common diseases. Here, we show that extremely low-coverage sequencing (0.1-0.5×) captures almost as much of the common (>5%) and low-frequency (1-5%) variation across the genome as SNP arrays. As an empirical demonstration, we show that genome-wide SNP genotypes can be inferred at a mean r^2 of 0.71 using off-target data (0.24× average coverage) in a whole-exome study of 909 samples. Using both simulated and real exome-sequencing data sets, we show that association statistics obtained using extremely low-coverage sequencing data attain similar P values at known associated variants as data from genotyping arrays, without an excess of false positives. Within the context of reductions in sample preparation and sequencing costs, funds invested in extremely low-coverage sequencing can yield several times the effective sample size of GWAS based on SNP array data and a commensurate increase in statistical power.
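
The effective-sample-size argument in this abstract can be illustrated with a small sketch. It assumes the common approximation that testing an imperfectly inferred genotype with squared correlation r^2 to the truth is roughly equivalent to testing the true genotype in N × r^2 samples. The budget and per-sample costs below are hypothetical placeholders, not figures from the study, and treating array genotypes as perfectly accurate (r^2 = 1.0) is likewise an assumption made only for the comparison.

```python
def effective_sample_size(n_samples, r2):
    """Approximate effective sample size when genotypes are inferred with accuracy r2."""
    return n_samples * r2


def samples_within_budget(budget, cost_per_sample):
    """Number of whole samples that a fixed budget can pay for."""
    return int(budget // cost_per_sample)


# Figures quoted in the abstract: 909 samples, mean r^2 of 0.71.
observed = effective_sample_size(909, 0.71)

# Hypothetical costs, chosen only to illustrate the trade-off.
BUDGET = 300_000      # assumed total budget
ARRAY_COST = 300      # assumed cost per sample on a SNP array
LOWCOV_COST = 60      # assumed cost per sample for low-coverage sequencing

array_n_eff = effective_sample_size(samples_within_budget(BUDGET, ARRAY_COST), 1.0)
lowcov_n_eff = effective_sample_size(samples_within_budget(BUDGET, LOWCOV_COST), 0.71)

print(f"effective N at r^2 = 0.71 for 909 samples: {observed:.2f}")
print(f"low-coverage vs array effective N ratio: {lowcov_n_eff / array_n_eff:.2f}")
```

Under these assumed costs the cheaper, noisier design comes out ahead because the gain in sample count outweighs the loss in per-sample accuracy; with different cost ratios the conclusion could reverse.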

2.
The 1000 Genomes Project and disease-specific sequencing efforts are producing large collections of haplotypes that can be used as reference panels for genotype imputation in genome-wide association studies (GWAS). However, imputing from large reference panels with existing methods imposes a high computational burden. We introduce a strategy called 'pre-phasing' that maintains the accuracy of leading methods while reducing computational costs. We first statistically estimate the haplotypes for each individual within the GWAS sample (pre-phasing) and then impute missing genotypes into these estimated haplotypes. This reduces the computational cost because (i) the GWAS samples must be phased only once, whereas standard methods would implicitly repeat phasing with each reference panel update, and (ii) it is much faster to match a phased GWAS haplotype to one reference haplotype than to match two unphased GWAS genotypes to a pair of reference haplotypes. We implemented our approach in the MaCH and IMPUTE2 frameworks, and we tested it on data sets from the Wellcome Trust Case Control Consortium 2 (WTCCC2), the Genetic Association Information Network (GAIN), the Women's Health Initiative (WHI) and the 1000 Genomes Project. This strategy will be particularly valuable for repeated imputation as reference panels evolve.

3.
To identify loci for age at menarche, we performed a meta-analysis of 32 genome-wide association studies in 87,802 women of European descent, with replication in up to 14,731 women. In addition to the known loci at LIN28B (P = 5.4 × 10^-60) and 9q31.2 (P = 2.2 × 10^-33), we identified 30 new menarche loci (all P < 5 × 10^-8) and found suggestive evidence for a further 10 loci (P < 1.9 × 10^-6). The new loci included four previously associated with body mass index (in or near FTO, SEC16B, TRA2B and TMEM18), three in or near other genes implicated in energy homeostasis (BSX, CRTC1 and MCHR2) and three in or near genes implicated in hormonal regulation (INHBA, PCSK2 and RXRG). Ingenuity and gene-set enrichment pathway analyses identified coenzyme A and fatty acid biosynthesis as biological processes related to menarche timing.

4.
Atopic dermatitis (AD) is a commonly occurring chronic skin disease with high heritability. Apart from filaggrin (FLG), the genes influencing atopic dermatitis are largely unknown. We conducted a genome-wide association meta-analysis of 5,606 affected individuals and 20,565 controls from 16 population-based cohorts and then examined the ten most strongly associated new susceptibility loci in an additional 5,419 affected individuals and 19,833 controls from 14 studies. Three SNPs reached genome-wide significance in the discovery and replication cohorts combined, including rs479844 upstream of OVOL1 (odds ratio (OR) = 0.88, P = 1.1 × 10^-13) and rs2164983 near ACTL9 (OR = 1.16, P = 7.1 × 10^-9), both of which are near genes that have been implicated in epidermal proliferation and differentiation, as well as rs2897442 in KIF3A within the cytokine cluster at 5q31.1 (OR = 1.11, P = 3.8 × 10^-8). We also replicated association with the FLG locus and with two recently identified association signals at 11q13.5 (rs7927894; P = 0.008) and 20q13.33 (rs6010620; P = 0.002). Our results underline the importance of both epidermal barrier function and immune dysregulation in atopic dermatitis pathogenesis.

5.
Genome-wide association studies involving hundreds of thousands of SNPs in thousands of cases and controls are now underway. The first of many analytical challenges in these studies involves the choice of SNPs to genotype. It is not practical to construct a different panel of tag SNPs for each study, so the first generation of genome-wide scans will use predefined, commercially available marker panels, which will in part dictate their success or failure. We compare different approaches in use today, and show that although many of them provide substantial coverage of common variation in non-African populations, the precise extent is strongly dependent on the frequencies of alleles of interest and on specific considerations of study design. Overall, despite substantial differences in genotyping technologies, marker selection strategies and number of markers assayed, the first-generation high-throughput platforms all offer similar levels of genome coverage.

6.
7.
Multiple genetic variants have been associated with adult obesity and a few with severe obesity in childhood; however, less progress has been made in establishing genetic influences on common early-onset obesity. We performed a North American, Australian and European collaborative meta-analysis of 14 studies consisting of 5,530 cases (≥95th percentile of body mass index (BMI)) and 8,318 controls (<50th percentile of BMI) of European ancestry. Taking forward the eight newly discovered signals yielding association with P < 5 × 10^-6 in nine independent data sets (2,818 cases and 4,083 controls), we observed two loci that yielded genome-wide significant combined P values near OLFM4 at 13q14 (rs9568856; P = 1.82 × 10^-9; odds ratio (OR) = 1.22) and within HOXB5 at 17q21 (rs9299; P = 3.54 × 10^-9; OR = 1.14). Both loci continued to show association when two extreme childhood obesity cohorts were included (2,214 cases and 2,674 controls). These two loci also yielded directionally consistent associations in a previous meta-analysis of adult BMI [1].

8.
Genome-wide association studies (GWAS) are a standard approach for studying the genetics of natural variation. A major concern in GWAS is the need to account for the complicated dependence structure of the data, both between loci and between individuals. Mixed models have emerged as a general and flexible approach for correcting for population structure in GWAS. Here, we extend this linear mixed-model approach to carry out GWAS of correlated phenotypes, deriving a fully parameterized multi-trait mixed model (MTMM) that considers both the within-trait and between-trait variance components simultaneously for multiple traits. We apply this to data from a human cohort for correlated blood lipid traits from the Northern Finland Birth Cohort 1966 and show greatly increased power to detect pleiotropic loci that affect more than one blood lipid trait. We also apply this approach to an Arabidopsis thaliana data set for flowering measurements in two different locations, identifying loci whose effect depends on the environment.

9.
10.
We performed a genome-wide association study (GWAS) of Kawasaki disease in Japanese subjects using data from 428 individuals with Kawasaki disease (cases) and 3,379 controls genotyped at 473,803 SNPs. We validated the association results in two independent replication panels totaling 754 cases and 947 controls. We observed significant associations in the FAM167A-BLK region at 8p22-23 (rs2254546, P = 8.2 × 10^-21), in the human leukocyte antigen (HLA) region at 6p21.3 (rs2857151, P = 4.6 × 10^-11) and in the CD40 region at 20q13 (rs4813003, P = 4.8 × 10^-8). We also replicated the association of a functional SNP of FCGR2A (rs1801274, P = 1.6 × 10^-6) identified in a recently reported GWAS of Kawasaki disease. Our findings provide new insights into the pathogenesis and pathophysiology of Kawasaki disease.

11.
Graves' disease is a common autoimmune disorder characterized by thyroid stimulating hormone receptor autoantibodies (TRAb) and hyperthyroidism. To investigate the genetic architecture of Graves' disease, we conducted a genome-wide association study in 1,536 individuals with Graves' disease (cases) and 1,516 controls. We further evaluated a group of associated SNPs in a second set of 3,994 cases and 3,510 controls. We confirmed four previously reported loci (in the major histocompatibility complex, TSHR, CTLA4 and FCRL3) and identified two new susceptibility loci (the RNASET2-FGFR1OP-CCR6 region at 6q27 (P_combined = 6.85 × 10^-10 for rs9355610) and an intergenic region at 4p14 (P_combined = 1.08 × 10^-13 for rs6832151)). These newly associated SNPs were correlated with the expression levels of RNASET2 at 6q27, of CHRNA9 and of a previously uncharacterized gene at 4p14, respectively. Moreover, we identified strong associations of TSHR and major histocompatibility complex class II variants with persistently TRAb-positive Graves' disease.

12.
We conducted a three-stage genetic study to identify susceptibility loci for type 2 diabetes (T2D) in east Asian populations. We followed our stage 1 meta-analysis of eight T2D genome-wide association studies (6,952 cases with T2D and 11,865 controls) with a stage 2 in silico replication analysis (5,843 cases and 4,574 controls) and a stage 3 de novo replication analysis (12,284 cases and 13,172 controls). The combined analysis identified eight new T2D loci reaching genome-wide significance, which mapped in or near GLIS3, PEPD, FITM2-R3HDML-HNF4A, KCNK16, MAEA, GCC1-PAX4, PSMD6 and ZFAND3. GLIS3, which is involved in pancreatic beta cell development and insulin gene expression, is known for its association with fasting glucose levels. The evidence of an association with T2D for PEPD and HNF4A has been shown in previous studies. KCNK16 may regulate glucose-dependent insulin secretion in the pancreas. These findings, derived from an east Asian population, provide new perspectives on the etiology of T2D.

13.
Lin S, Chakravarti A, Cutler DJ. Nature Genetics, 2004, 36(11): 1181-1188
Genome-wide disease-association mapping has been heralded as the study design of the next generation, but the lack of analytical methods to use genotype data fully is a large stumbling block. Here we describe an algorithm and statistical method that efficiently and exhaustively exploits haplotype information by subjecting alleles (a marker or contiguous sets of markers) from sliding windows of all sizes to transmission disequilibrium tests. By applying our method to simulated data and to Hirschsprung disease, we show that it can detect both common and rare disease variants of small effect. These results show that the theoretical benefits of genome-wide association studies are at last realizable.
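
The idea of applying transmission disequilibrium tests (TDT) to sliding windows of all sizes can be sketched as follows. This is a brute-force illustration and not the authors' efficient algorithm: it assumes each parent is already summarized as a pair of phased haplotype strings (transmitted, untransmitted), and it uses the classic TDT statistic (b - c)^2 / (b + c), where b and c count how often a haplotype allele is transmitted versus not transmitted by parents heterozygous in that window. The function names and the toy data are hypothetical.

```python
from collections import Counter


def tdt_statistic(transmitted, untransmitted):
    """Classic TDT chi-square statistic (1 d.f.) for one allele."""
    total = transmitted + untransmitted
    return 0.0 if total == 0 else (transmitted - untransmitted) ** 2 / total


def sliding_window_tdt(parent_haplotypes):
    """Run the TDT on every contiguous marker window of every size.

    parent_haplotypes: list of (transmitted, untransmitted) strings, one pair
    per parent, all of the same length (one character per marker).
    Returns {(start, size, allele): statistic}.
    """
    n_markers = len(parent_haplotypes[0][0])
    results = {}
    for size in range(1, n_markers + 1):
        for start in range(n_markers - size + 1):
            sent, kept = Counter(), Counter()
            for trans, untrans in parent_haplotypes:
                t_allele = trans[start:start + size]
                u_allele = untrans[start:start + size]
                if t_allele != u_allele:  # only heterozygous parents are informative
                    sent[t_allele] += 1
                    kept[u_allele] += 1
            for allele in set(sent) | set(kept):
                results[(start, size, allele)] = tdt_statistic(sent[allele], kept[allele])
    return results


# Toy data: five parents, two markers (hypothetical).
toy_parents = [("11", "00"), ("11", "01"), ("11", "10"), ("10", "11"), ("00", "00")]
toy_results = sliding_window_tdt(toy_parents)
```

A real scan would also need a multiple-testing correction across the many overlapping windows, which this sketch leaves out.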

14.
Population stratification (allele frequency differences between cases and controls due to systematic ancestry differences) can cause spurious associations in disease studies. We describe a method that enables explicit detection and correction of population stratification on a genome-wide scale. Our method uses principal components analysis to explicitly model ancestry differences between cases and controls. The resulting correction is specific to a candidate marker's variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. Our simple, efficient approach can easily be applied to disease studies with hundreds of thousands of markers.
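
A minimal sketch of this kind of principal-components correction, under simplifying assumptions: a single axis of variation is extracted by power iteration from mean-centered genotypes (without the per-marker variance normalization a full analysis would apply), the candidate genotype and the phenotype are both residualized on that axis, and association is scored as (N - K - 1) × r^2 with K axes removed. All function names and the eight-individual toy data set are hypothetical.

```python
from math import sqrt


def center(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_axis(marker_rows, iters=50):
    """Leading axis of variation across individuals, found by power iteration."""
    n = len(marker_rows[0])
    rows = [center(r) for r in marker_rows]
    v = [float(i + 1) for i in range(n)]  # fixed start vector for reproducibility
    for _ in range(iters):
        w = [0.0] * n
        for r in rows:  # accumulate (X^T X) v one marker at a time
            weight = dot(r, v)
            for j in range(n):
                w[j] += weight * r[j]
        norm = sqrt(dot(w, w))
        v = [x / norm for x in w]
    return v


def residualize(values, axis):
    """Center the values and remove their projection onto a unit-length axis."""
    c = center(values)
    loading = dot(c, axis)
    return [x - loading * a for x, a in zip(c, axis)]


def assoc_stat(x, y, n_axes):
    """(N - K - 1) * r^2 for two centered vectors after removing K axes."""
    denom = dot(x, x) * dot(y, y)
    if denom == 0:
        return 0.0
    return (len(x) - n_axes - 1) * dot(x, y) ** 2 / denom


# Toy data: individuals 0-3 come from one population, 4-7 from another.
ancestry_markers = [[0, 0, 0, 0, 2, 2, 2, 2],
                    [0, 0, 0, 0, 1, 1, 1, 1],
                    [2, 2, 2, 2, 0, 0, 0, 0]]
candidate = [0, 1, 0, 1, 1, 2, 1, 2]   # allele is more common in the second population
phenotype = [0, 0, 0, 1, 1, 1, 1, 0]   # disease is more common in the second population

axis = top_axis(ancestry_markers)
naive = assoc_stat(center(candidate), center(phenotype), 0)
corrected = assoc_stat(residualize(candidate, axis), residualize(phenotype, axis), 1)
```

In this toy example the uncorrected statistic is driven entirely by the frequency difference between the two populations, and it vanishes once the ancestry axis is removed from both genotype and phenotype.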

15.
To identify the genetic bases for nine metabolic traits, we conducted a meta-analysis combining Korean genome-wide association results from the KARE project (n = 8,842) and the HEXA shared control study (n = 3,703). We verified the associations of the loci selected from the discovery meta-analysis in the replication stage (30,395 individuals from the BioBank Japan genome-wide association study and individuals comprising the Health2 and Shanghai Jiao Tong University Diabetes cohorts). We identified ten genome-wide significant signals newly associated with traits from an overall meta-analysis. The most compelling associations involved 12q24.11 (near MYL2) and 12q24.13 (in C12orf51) for high-density lipoprotein cholesterol, 2p21 (near SIX2-SIX3) for fasting plasma glucose, 19q13.33 (in RPS11) and 6q22.33 (in RSPO3) for renal traits, and 12q24.11 (near MYL2), 12q24.13 (in C12orf51 and near OAS1), 4q31.22 (in ZNF827) and 7q11.23 (near TBL2-BCL7B) for hepatic traits. These findings highlight previously unknown biological pathways for metabolic traits investigated in this study.

16.
Lin Z, Bei JX, Shen M, Li Q, Liao Z, Zhang Y, Lv Q, Wei Q, Low HQ, Guo YM, Cao S, Yang M, Hu Z, Xu M, Wang X, Wei Y, Li L, Li C, Li T, Huang J, Pan Y, Jin O, Wu Y, Wu J, Guo Z, He P, Hu S, Wu H, Song H, Zhan F, Liu S, Gao G, Liu Z, Li Y, Xiao C, Li J, Ye Z, He W, Liu D, Shen L, Huang A, Wu H, Tao Y, Pan X, Yu B, Tai ES, Zeng YX, Ren EC, Shen Y, Liu J, Gu J. Nature Genetics, 2012, 44(1): 73-77
To identify susceptibility loci for ankylosing spondylitis, we performed a two-stage genome-wide association study in Han Chinese. In the discovery stage, we analyzed 1,356,350 autosomal SNPs in 1,837 individuals with ankylosing spondylitis and 4,231 controls; in the validation stage, we analyzed 30 suggestive SNPs in an additional 2,100 affected individuals and 3,496 controls. We identified two new susceptibility loci between EDIL3 and HAPLN1 at 5q14.3 (rs4552569; P = 8.77 × 10^-10) and within ANO6 at 12q12 (rs17095830; P = 1.63 × 10^-8). We also confirmed previously reported associations in Europeans within the major histocompatibility complex (MHC) region (top SNP, rs13202464; P < 5 × 10^-324) and at 2p15 (rs10865331; P = 1.98 × 10^-8). We show that rs13202464 within the MHC region mainly represents the risk effect of HLA-B*27 variants (including HLA-B*2704, HLA-B*2705 and HLA-B*2715) in Chinese. The two newly discovered loci implicate genes related to bone formation and cartilage development, suggesting their potential involvement in the etiology of ankylosing spondylitis.

17.
Population structure causes genome-wide linkage disequilibrium between unlinked loci, leading to statistical confounding in genome-wide association studies. Mixed models have been shown to handle the confounding effects of a diffuse background of large numbers of loci of small effect well, but they do not always account for loci of larger effect. Here we propose a multi-locus mixed model as a general method for mapping complex traits in structured populations. Simulations suggest that our method outperforms existing methods in terms of power as well as false discovery rate. We apply our method to human and Arabidopsis thaliana data, identifying new associations and evidence for allelic heterogeneity. We also show how a priori knowledge from an A. thaliana linkage mapping study can be integrated into our method using a Bayesian approach. Our implementation is computationally efficient, making the analysis of large data sets (n > 10,000) practicable.

18.
To find new candidate loci predisposing individuals to Kawasaki disease, an acute vasculitis that affects children, we conducted a genome-wide association study in 622 individuals with Kawasaki disease (cases) and 1,107 controls in a Han Chinese population residing in Taiwan, with replication in an independent Han Chinese sample of 261 cases and 550 controls. We report two new loci, one at BLK (encoding B-lymphoid tyrosine kinase) and one at CD40, that are associated with Kawasaki disease at genome-wide significance (P < 5 × 10^-8). Our findings may lead to a better understanding of the role of immune activation and inflammation in Kawasaki disease pathogenesis.

19.
Genome-wide association is a promising approach to identify common genetic variants that predispose to human disease. Because of the high cost of genotyping hundreds of thousands of markers on thousands of subjects, genome-wide association studies often follow a staged design in which a proportion (π_samples) of the available samples are genotyped on a large number of markers in stage 1, and a proportion (π_markers) of these markers are later followed up by genotyping them on the remaining samples in stage 2. The standard strategy for analyzing such two-stage data is to view stage 2 as a replication study and focus on findings that reach statistical significance when stage 2 data are considered alone. We demonstrate that the alternative strategy of jointly analyzing the data from both stages almost always results in increased power to detect genetic association, despite the need to use more stringent significance levels, even when effect sizes differ between the two stages. We recommend joint analysis for all two-stage genome-wide association studies, especially when a relatively large proportion of the samples are genotyped in stage 1 (π_samples ≥ 0.30), and a relatively large proportion of markers are selected for follow-up in stage 2 (π_markers ≥ 0.01).
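
The joint analysis of a two-stage design can be sketched with a sample-size-weighted combination of the stage 1 and stage 2 test statistics. This is an illustrative sketch under stated assumptions: each stage yields a z statistic that is standard normal under the null, and the weights are the square roots of the sample fractions, so the joint statistic is also standard normal under the null. The function names are hypothetical, and the sketch ignores the adjusted significance threshold that a real joint analysis needs because stage 2 markers are selected on their stage 1 results.

```python
from math import erfc, sqrt


def joint_z(z1, z2, pi_samples):
    """Combine stage 1 and stage 2 z statistics, weighting by sample fraction.

    pi_samples is the fraction of all samples genotyped in stage 1.
    """
    return sqrt(pi_samples) * z1 + sqrt(1.0 - pi_samples) * z2


def two_sided_p(z):
    """Two-sided P value for a standard normal z statistic."""
    return erfc(abs(z) / sqrt(2.0))


# Example: equal-sized stages, each with moderate evidence (hypothetical values).
z_combined = joint_z(3.0, 3.0, 0.5)
print(f"joint z = {z_combined:.3f}, P = {two_sided_p(z_combined):.2e}")
print(f"stage 2 alone: P = {two_sided_p(3.0):.2e}")
```

Two stages with moderate, concordant evidence combine into a stronger joint signal than either stage provides alone, which is the intuition behind the power gain described in the abstract.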

20.
Lung adenocarcinoma is the most common histological type of lung cancer, and its incidence is increasing worldwide. To identify genetic factors influencing risk of lung adenocarcinoma, we conducted a genome-wide association study and two validation studies in the Japanese population comprising a total of 6,029 individuals with lung adenocarcinoma (cases) and 13,535 controls. We confirmed two previously reported risk loci, 5p15.33 (rs2853677, P_combined = 2.8 × 10^-40, odds ratio (OR) = 1.41) and 3q28 (rs10937405, P_combined = 6.9 × 10^-17, OR = 1.25), and identified two new susceptibility loci, 17q24.3 (rs7216064, P_combined = 7.4 × 10^-11, OR = 1.20) and 6p21.3 (rs3817963, P_combined = 2.7 × 10^-10, OR = 1.18). These data provide further evidence supporting a role for genetic susceptibility in the development of lung adenocarcinoma.
