Extremely low-coverage sequencing and imputation increases power for genome-wide association studies
Authors: Pasaniuc Bogdan, Rohland Nadin, McLaren Paul J, Garimella Kiran, Zaitlen Noah, Li Heng, Gupta Namrata, Neale Benjamin M, Daly Mark J, Sklar Pamela, Sullivan Patrick F, Bergen Sarah, Moran Jennifer L, Hultman Christina M, Lichtenstein Paul, Magnusson Patrik, Purcell Shaun M, Haas David W, Liang Liming, Sunyaev Shamil, Patterson Nick, de Bakker Paul I W, Reich David, Price Alkes L
Institution: Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, USA. bpasaniu@hsph.harvard.edu
Abstract: Genome-wide association studies (GWAS) have proven to be a powerful method to identify common genetic variants contributing to susceptibility to common diseases. Here, we show that extremely low-coverage sequencing (0.1-0.5×) captures almost as much of the common (>5%) and low-frequency (1-5%) variation across the genome as SNP arrays. As an empirical demonstration, we show that genome-wide SNP genotypes can be inferred at a mean r² of 0.71 using off-target data (0.24× average coverage) in a whole-exome study of 909 samples. Using both simulated and real exome-sequencing data sets, we show that association statistics obtained using extremely low-coverage sequencing data attain similar P values at known associated variants as data from genotyping arrays, without an excess of false positives. Within the context of reductions in sample preparation and sequencing costs, funds invested in extremely low-coverage sequencing can yield several times the effective sample size of GWAS based on SNP array data and a commensurate increase in statistical power.
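The effective-sample-size argument in the abstract can be illustrated with a rough calculation. The sketch below assumes the common approximation that testing an imputed genotype with accuracy r² is about as informative as testing r² × N perfectly genotyped samples. The sample count (909) and mean r² (0.71) come from the abstract; the budget, the per-sample costs, and the function names are hypothetical values chosen only for illustration and are not taken from the paper.

```python
def effective_sample_size(n_samples: int, mean_r2: float) -> float:
    """Approximate effective sample size as n_samples * r^2 (an assumed rule of thumb)."""
    if not 0.0 <= mean_r2 <= 1.0:
        raise ValueError("mean_r2 must lie in [0, 1]")
    return n_samples * mean_r2


def samples_for_budget(budget: float, cost_per_sample: float) -> int:
    """Number of whole samples that a fixed budget can pay for."""
    if cost_per_sample <= 0:
        raise ValueError("cost_per_sample must be positive")
    return int(budget // cost_per_sample)


# Figures reported in the abstract: 909 samples imputed at a mean r^2 of 0.71.
n_eff_exome = effective_sample_size(909, 0.71)

# Hypothetical budget and per-sample costs (assumptions, not from the paper).
BUDGET = 300_000.0
ARRAY_COST = 300.0     # assumed all-in cost per SNP-array sample
LOW_COV_COST = 60.0    # assumed all-in cost per low-coverage sequenced sample

# Treat array genotypes as r^2 = 1.0 for simplicity (another assumption).
n_eff_array = effective_sample_size(samples_for_budget(BUDGET, ARRAY_COST), 1.0)
n_eff_low_cov = effective_sample_size(samples_for_budget(BUDGET, LOW_COV_COST), 0.71)
ratio = n_eff_low_cov / n_eff_array

print(f"Effective N for the exome study: {n_eff_exome:.1f}")
print(f"Effective N ratio (low coverage / array): {ratio:.2f}")
```

With these assumed costs, the lower per-sample price outweighs the loss in imputation accuracy. Different cost assumptions would change the ratio, so the numbers should be read as an illustration of the trade-off rather than as a result.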
This article is indexed in PubMed and other databases.