Similar articles
 Found 20 similar articles; search took 140 ms
1.
Genome-wide association studies (GWAS) have proven to be a powerful method to identify common genetic variants contributing to susceptibility to common diseases. Here, we show that extremely low-coverage sequencing (0.1-0.5×) captures almost as much of the common (>5%) and low-frequency (1-5%) variation across the genome as SNP arrays. As an empirical demonstration, we show that genome-wide SNP genotypes can be inferred at a mean r² of 0.71 using off-target data (0.24× average coverage) in a whole-exome study of 909 samples. Using both simulated and real exome-sequencing data sets, we show that association statistics obtained using extremely low-coverage sequencing data attain similar P values at known associated variants as data from genotyping arrays, without an excess of false positives. Within the context of reductions in sample preparation and sequencing costs, funds invested in extremely low-coverage sequencing can yield several times the effective sample size of GWAS based on SNP array data and a commensurate increase in statistical power.
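A common rule of thumb, which is an assumption here and not something the abstract states, is that testing genotypes imputed with accuracy r² in N samples has roughly the power of testing directly typed genotypes in N·r² samples. The minimal Python sketch below applies that arithmetic to the 909 samples and mean r² of 0.71 quoted above; the function name is hypothetical.

```python
def effective_sample_size(n_samples: int, mean_r2: float) -> float:
    """Approximate effective sample size for imputed genotypes.

    Uses the rule of thumb N_eff = N * r^2. This is an assumed
    approximation, not the calculation used in the cited study.
    """
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    if not 0.0 <= mean_r2 <= 1.0:
        raise ValueError("mean_r2 must lie in [0, 1]")
    return n_samples * mean_r2


# Figures taken from the abstract: 909 samples, mean r^2 of 0.71.
n_eff = effective_sample_size(909, 0.71)
print(f"effective sample size: {n_eff:.1f}")
```

Under this approximation, a lower r² can be offset by sequencing more samples, which is the trade-off the abstract's cost argument relies on.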

2.
Genome-wide association studies are set to become the method of choice for uncovering the genetic basis of human diseases. A central challenge in this area is the development of powerful multipoint methods that can detect causal variants that have not been directly genotyped. We propose a coherent analysis framework that treats the problem as one involving missing or uncertain genotypes. Central to our approach is a model-based imputation method for inferring genotypes at observed or unobserved SNPs, leading to improved power over existing methods for multipoint association mapping. Using real genome-wide association study data, we show that our approach (i) is accurate and well calibrated, (ii) provides detailed views of associated regions that facilitate follow-up studies and (iii) can be used to validate and correct data at genotyped markers. A notable future use of our method will be to boost power by combining data from genome-wide scans that use different SNP sets.

3.
Population structure causes genome-wide linkage disequilibrium between unlinked loci, leading to statistical confounding in genome-wide association studies. Mixed models have been shown to handle the confounding effects of a diffuse background of large numbers of loci of small effect well, but they do not always account for loci of larger effect. Here we propose a multi-locus mixed model as a general method for mapping complex traits in structured populations. Simulations suggest that our method outperforms existing methods in terms of power as well as false discovery rate. We apply our method to human and Arabidopsis thaliana data, identifying new associations and evidence for allelic heterogeneity. We also show how a priori knowledge from an A. thaliana linkage mapping study can be integrated into our method using a Bayesian approach. Our implementation is computationally efficient, making the analysis of large data sets (n > 10,000) practicable.

4.
The hindbrain roof plate and choroid plexus are essential organizing centers for inducing dorsal neuron fates and sustaining neuron function. To map the formation of these structures, we developed a broadly applicable, high resolution, recombinase-based method for mapping the fate of cells originating from coordinates defined by intersecting combinations of expressed genes. Using this method, we show that distinct regions of hindbrain roof plate originate from discrete subdomains of rhombencephalic neuroectoderm expressing Wnt1; that choroid plexus, a secretory epithelium important for patterning later-formed hindbrain structures and maintaining neuron function, derives from the same embryonic primordium as the hindbrain roof plate; and that, unlike the floor plate, these dorsal organizing centers develop in a patterned, segmental manner, built from lineage-restricted compartments. Our data suggest that the roof plate and choroid plexus may be formed of functional units that are capable of differentially organizing the generation of distinct neuronal cell types at different axial levels.

5.
Although experimental and theoretical efforts have been applied to globally map genetic interactions, we still do not understand how gene-gene interactions arise from the operation of biomolecular networks. To bridge the gap between empirical and computational studies, we (i) quantitatively measured genetic interactions between ~185,000 metabolic gene pairs in Saccharomyces cerevisiae, (ii) superposed the data on a detailed systems biology model of metabolism and (iii) introduced a machine-learning method to reconcile empirical interaction data with model predictions. We systematically investigated the relative impacts of functional modularity and metabolic flux coupling on the distribution of negative and positive genetic interactions. We also provide a mechanistic explanation for the link between the degree of genetic interaction, pleiotropy and gene dispensability. Last, we show the feasibility of automated metabolic model refinement by correcting misannotations in NAD biosynthesis and confirming them by in vivo experiments.

6.
7.
8.
Characterizing genetic diversity within and between populations has broad applications in studies of human disease and evolution. We propose a new approach, spatial ancestry analysis, for the modeling of genotypes in two- or three-dimensional space. In spatial ancestry analysis (SPA), we explicitly model the spatial distribution of each SNP by assigning an allele frequency as a continuous function in geographic space. We show that the explicit modeling of the allele frequency allows individuals to be localized on the map on the basis of their genetic information alone. We apply our SPA method to a European and a worldwide population genetic variation data set and identify SNPs showing large gradients in allele frequency, and we suggest these as candidate regions under selection. These regions include SNPs in the well-characterized LCT region, as well as at loci including FOXP2, OCA2 and LRP1B.

9.
Merlin--rapid analysis of dense genetic maps using sparse gene flow trees.   Total citations: 32, self-citations: 0, citations by others: 32
Efforts to find disease genes using high-density single-nucleotide polymorphism (SNP) maps will produce data sets that exceed the limitations of current computational tools. Here we describe a new, efficient method for the analysis of dense genetic maps in pedigree data that provides extremely fast solutions to common problems such as allele-sharing analyses and haplotyping. We show that sparse binary trees represent patterns of gene flow in general pedigrees in a parsimonious manner, and derive a family of related algorithms for pedigree traversal. With these trees, exact likelihood calculations can be carried out efficiently for single markers or for multiple linked markers. Using an approximate multipoint calculation that ignores the unlikely possibility of a large number of recombinants further improves speed and provides accurate solutions in dense maps with thousands of markers. Our multipoint engine for rapid likelihood inference (Merlin) is a computer program that uses sparse inheritance trees for pedigree analysis; it performs rapid haplotyping, genotype error detection and affected pair linkage analyses and can handle more markers than other pedigree analysis packages.

10.
Gu Z, Rifkin SA, White KP, Li WH. Nature Genetics, 2004, 36(6): 577-579
Using microarray gene expression data from several Drosophila species and strains, we show that duplicated genes, compared with single-copy genes, significantly increase gene expression diversity during development. We show further that duplicate genes tend to cause expression divergences between Drosophila species (or strains) to evolve faster than do single-copy genes. This conclusion is also supported by data from different yeast strains.

11.
R-spondins are a recently characterized small family of growth factors. Here we show that human R-spondin1 (RSPO1) is the gene disrupted in a recessive syndrome characterized by XX sex reversal, palmoplantar hyperkeratosis and predisposition to squamous cell carcinoma of the skin. Our data show, for the first time, that disruption of a single gene can lead to complete female-to-male sex reversal in the absence of the testis-determining gene, SRY.

12.
We present an approximate conditional and joint association analysis that can use summary-level statistics from a meta-analysis of genome-wide association studies (GWAS) and estimated linkage disequilibrium (LD) from a reference sample with individual-level genotype data. Using this method, we analyzed meta-analysis summary data from the GIANT Consortium for height and body mass index (BMI), with the LD structure estimated from genotype data in two independent cohorts. We identified 36 loci with multiple associated variants for height (38 leading and 49 additional SNPs, 87 in total) via a genome-wide SNP selection procedure. The 49 new SNPs explain approximately 1.3% of variance, nearly doubling the heritability explained at the 36 loci. We did not find any locus showing multiple associated SNPs for BMI. The method we present is computationally fast and is also applicable to case-control data, which we demonstrate in an example from meta-analysis of type 2 diabetes by the DIAGRAM Consortium.
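The general idea of recovering joint SNP effects from marginal summary statistics plus LD can be sketched as follows. The sketch assumes standardized genotypes and phenotype, so that marginal effects b and joint effects β satisfy b = Rβ, where R is the LD correlation matrix from a reference sample. This is a textbook simplification and not the authors' full procedure, which also has to deal with standard errors, allele frequencies and varying sample sizes. All names and numbers are hypothetical.

```python
from typing import List


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("LD matrix is singular (SNPs in perfect LD?)")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def joint_effects(ld: List[List[float]], marginal: List[float]) -> List[float]:
    """Joint effects beta from marginal effects b, assuming b = R @ beta."""
    return solve_linear(ld, marginal)


# Hypothetical two-SNP locus: LD correlation 0.8, only SNP 1 is causal.
ld = [[1.0, 0.8], [0.8, 1.0]]
marginal = [0.30, 0.24]  # SNP 2 looks associated only through LD with SNP 1
print(joint_effects(ld, marginal))
```

In this toy locus the second SNP's marginal signal is explained by LD with the first, so its joint effect comes out at approximately zero. A locus with a genuinely independent second signal would keep a non-zero joint effect.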

13.
14.
A high-resolution survey of deletion polymorphism in the human genome   Total citations: 20, self-citations: 0, citations by others: 20
Recent work has shown that copy number polymorphism is an important class of genetic variation in human genomes. Here we report a new method that uses SNP genotype data from parent-offspring trios to identify polymorphic deletions. We applied this method to data from the International HapMap Project to produce the first high-resolution population surveys of deletion polymorphism. Approximately 100 of these deletions have been experimentally validated using comparative genome hybridization on tiling-resolution oligonucleotide microarrays. Our analysis identifies a total of 586 distinct regions that harbor deletion polymorphisms in one or more of the families. Notably, we estimate that typical individuals are hemizygous for roughly 30-50 deletions larger than 5 kb, totaling around 550-750 kb of euchromatic sequence across their genomes. The detected deletions span a total of 267 known and predicted genes. Overall, however, the deleted regions are relatively gene-poor, consistent with the action of purifying selection against deletions. Deletion polymorphisms may well have an important role in the genetics of complex traits; however, they are not directly observed in most current gene mapping studies. Our new method will permit the identification of deletion polymorphisms in high-density SNP surveys of trio or other family data.
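The abstract does not spell out how trio genotypes reveal deletions. One signal such a method could exploit, offered here as an illustrative assumption and not as the authors' algorithm, is that a hemizygous individual is typically called homozygous for the one allele that remains, which can produce apparent Mendelian inconsistencies in a trio. The toy Python sketch below flags runs of consecutive inconsistent SNPs as candidate regions; the function names, the genotype encoding and the `min_run` threshold are all hypothetical.

```python
from itertools import product
from typing import List, Tuple

# (father, mother, child) genotype calls at one SNP, e.g. ("AA", "AB", "AB")
Trio = Tuple[str, str, str]


def is_mendelian_consistent(father: str, mother: str, child: str) -> bool:
    """True if the child genotype can arise from one allele of each parent."""
    possible = {"".join(sorted(f + m)) for f, m in product(father, mother)}
    return "".join(sorted(child)) in possible


def candidate_deletion_runs(trios: List[Trio], min_run: int = 2) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index ranges of consecutive inconsistent SNPs."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, (father, mother, child) in enumerate(trios):
        if not is_mendelian_consistent(father, mother, child):
            if start is None:
                start = i  # open a new run
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and len(trios) - start >= min_run:
        runs.append((start, len(trios) - 1))
    return runs


# Hypothetical SNPs in map order; the two middle ones are inconsistent.
trios = [
    ("AA", "AB", "AB"),
    ("AA", "BB", "AA"),  # child appears to lack the maternal B allele
    ("BB", "AA", "BB"),  # child appears to lack the maternal A allele
    ("AB", "AB", "AA"),
]
print(candidate_deletion_runs(trios))
```

A realistic method would also need to tolerate uninformative SNPs inside a deletion and to model genotyping error, both of which this sketch ignores.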

15.
Emerging technologies make it possible for the first time to genotype hundreds of thousands of SNPs simultaneously, enabling whole-genome association studies. Using empirical genotype data from the International HapMap Project, we evaluate the extent to which the sets of SNPs contained on three whole-genome genotyping arrays capture common SNPs across the genome, and we find that the majority of common SNPs are well captured by these products either directly or through linkage disequilibrium. We explore analytical strategies that use HapMap data to improve power of association studies conducted with these fixed sets of markers and show that limited inclusion of specific haplotype tests in association analysis can increase the fraction of common variants captured by 25-100%. Finally, we introduce a Bayesian approach to association analysis by weighting the likelihood of each statistical test to reflect the number of putative causal alleles to which it is correlated.

16.
Schadt EE, Woo S, Hao K. Nature Genetics, 2012, 44(5): 603-608
RNA profiling can be used to capture the expression patterns of many genes that are associated with expression quantitative trait loci (eQTLs). Employing published putative cis eQTLs, we developed a Bayesian approach to predict SNP genotypes that is based only on RNA expression data. We show that predicted genotypes can accurately and uniquely identify individuals in large populations. When inferring genotypes from an expression data set using eQTLs of the same tissue type (but from an independent cohort), we were able to resolve 99% of the identities of individuals in the cohort at P(adjusted) ≤ 1 × 10⁻⁵. When eQTLs derived from one tissue were used to predict genotypes using expression data from a different tissue, the identities of 90% of the study subjects could be resolved at P(adjusted) ≤ 1 × 10⁻⁵. We discuss the implications of deriving genotypic information from RNA data deposited in the public domain.

17.
Revealing modular organization in the yeast transcriptional network   Total citations: 21, self-citations: 0, citations by others: 21

18.
It is often supposed that, except for tandem duplicates, genes are randomly distributed throughout the human genome. However, recent analyses suggest that when all the genes expressed in a given tissue (notably placenta and skeletal muscle) are examined, these genes do not map to random locations but instead resolve to clusters. We have asked three questions: (i) is this clustering true for most tissues, or are these the exceptions; (ii) is any clustering simply the result of the expression of tandem duplicates and (iii) how, if at all, does this relate to the observed clustering of genes with high expression rates? We provide a unified model of gene clustering that explains the previous observations. We examined Serial Analysis of Gene Expression (SAGE) data for 14 tissues and found significant clustering, in each tissue, that persists even after the removal of tandem duplicates. We confirmed clustering by analysis of independent expressed-sequence tag (EST) data. We then tested the possibility that the human genome is organized into subregions, each specializing in genes needed in a given tissue. By comparing genes expressed in different tissues, we show that this is not the case: those genes that seem to be tissue-specific in their expression do not, as a rule, cluster. We report that genes that are expressed in most tissues (housekeeping genes) show strong clustering. In addition, we show that the apparent clustering of genes with high expression rates is a consequence of the clustering of housekeeping genes.

19.
Natural selection on human microRNA binding sites inferred from SNP data   Total citations: 1, self-citations: 0, citations by others: 1
Chen K, Rajewsky N. Nature Genetics, 2006, 38(12): 1452-1456

20.
Toward simpler and faster genome-wide mutagenesis in mice   Total citations: 8, self-citations: 0, citations by others: 8
Wu S, Ying G, Wu Q, Capecchi MR. Nature Genetics, 2007, 39(7): 922-930
Here we describe a practical Cre-loxP and piggyBac transposon-based mutagenesis strategy to systematically mutate coding sequences and/or the vast noncoding regions of the mouse genome for large-scale functional genomic analysis. To illustrate this approach, we first created loxP-containing loss-of-function alleles in the protocadherin alpha, beta and gamma gene clusters (Pcdha, Pcdhb and Pcdhg). Using these alleles, we show that, under proper guidance, Cre-loxP site-specific recombination can mediate efficient trans-allelic recombination in vivo, facilitating the generation of large germline deletions and duplications including deletions of Pcdha, and Pcdha to Pcdhb, simply by breeding (that is, at frequencies of 5.5%-21.6%). The same breeding method can also generate designed germline translocations between nonhomologous chromosomes at unexpected frequencies of greater than 1%. By incorporating a piggyBac transposon to insert and to distribute loxP sites randomly throughout the mouse genome, we present a simple but comprehensive method for generating genome-wide deletions and duplications, in addition to insertional loss-of-function and conditional rescue alleles, again simply by breeding.
