Similar articles
1.
The extent of linkage disequilibrium in Arabidopsis thaliana.   Total citations: 20, self-citations: 0, citations by others: 20
Linkage disequilibrium (LD), the nonrandom occurrence of alleles in haplotypes, has long been of interest to population geneticists. Recently, the rapidly increasing availability of genomic polymorphism data has fueled interest in LD as a tool for fine-scale mapping, in particular for human disease loci. The chromosomal extent of LD is crucial in this context, because it determines how dense a map must be for associations to be detected and, conversely, limits how finely loci may be mapped. Arabidopsis thaliana is expected to harbor unusually extensive LD because of its high degree of selfing. Several polymorphism studies have found very strong LD within individual loci, but also evidence of some recombination. Here we investigate the pattern of LD on a genomic scale and show that in global samples, LD decays within approximately 1 cM, or 250 kb. We also show that LD in local populations may be much stronger than that of global populations, presumably as a result of founder events. The combination of a relatively high level of polymorphism and extensive haplotype structure bodes well for developing a genome-wide LD map in A. thaliana.

2.
Linkage disequilibrium (LD), or the non-random association of alleles, is poorly understood in the human genome. Population genetic theory suggests that LD is determined by the age of the markers, population history, recombination rate, selection and genetic drift. Despite the uncertainties in determining the relative contributions of these factors, some groups have argued that LD is a simple function of distance between markers. Disease-gene mapping studies and a simulation study gave differing predictions on the degree of LD in isolated and general populations. In view of the discrepancies between theory and experimental observations, we constructed a high-density SNP map of the Xq25-Xq28 region and analysed the male genotypes and haplotypes across this region for LD in three populations. The populations included an outbred European sample (CEPH males) and isolated population samples from Finland and Sardinia. We found two extended regions of strong LD bracketed by regions with no evidence for LD in all three samples. Haplotype analysis showed a paucity of haplotypes in regions of strong LD. Our results suggest that, in this region of the X chromosome, LD is not a monotonic function of the distance between markers, but is more a property of the particular location in the human genome.

3.
Recent genomic surveys have produced high-resolution haplotype information, but only in a small number of human populations. We report haplotype structure across 12 Mb of DNA sequence in 927 individuals representing 52 populations. The geographic distribution of haplotypes reflects human history, with a loss of haplotype diversity as distance increases from Africa. Although the extent of linkage disequilibrium (LD) varies markedly across populations, considerable sharing of haplotype structure exists, and inferred recombination hotspot locations generally match across groups. The four samples in the International HapMap Project contain the majority of common haplotypes found in most populations: averaging across populations, 83% of common 20-kb haplotypes in a population are also common in the most similar HapMap sample. Consequently, although the portability of tag SNPs based on the HapMap is reduced in low-LD Africans, the HapMap will be helpful for the design of genome-wide association mapping studies in nearly all human populations.

4.
Haplotype tagging for the identification of common disease genes   Total citations: 61, self-citations: 0, citations by others: 61
Genome-wide linkage disequilibrium (LD) mapping of common disease genes could be more powerful than linkage analysis if the appropriate density of polymorphic markers were known and if the genotyping effort and cost of producing such an LD map could be reduced. Although different metrics that measure the extent of LD have been evaluated, even the most recent studies have not placed significant emphasis on the most informative and cost-effective method of LD mapping: that based on haplotypes. We have scanned 135 kb of DNA from nine genes, genotyped 122 single-nucleotide polymorphisms (SNPs; approximately 184,000 genotypes) and determined the common haplotypes in a minimum of 384 European individuals for each gene. Here we show how knowledge of the common haplotypes and the SNPs that tag them can be used to (i) explain the often complex patterns of LD between adjacent markers, (ii) reduce genotyping significantly (in this case from 122 to 34 SNPs), (iii) scan the common variation of a gene sensitively and comprehensively and (iv) provide key fine-mapping data within regions of strong LD. Our results also indicate that, at least for the genes studied here, the current version of dbSNP would have been of limited utility for LD mapping because many common haplotypes could not be defined. A directed re-sequencing effort of the approximately 10% of the genome in or near genes in the major ethnic groups would aid the systematic evaluation of the common variant model of common disease.

5.
Recent studies of human populations suggest that the genome consists of chromosome segments that are ancestrally conserved ('haplotype blocks'; refs. 1-3) and have discrete boundaries defined by recombination hot spots. Using publicly available genetic markers, we have constructed a first-generation haplotype map of chromosome 19. As expected for this marker density, approximately one-third of the chromosome is encompassed within haplotype blocks. Evolutionary modeling of the data indicates that recombination hot spots are not required to explain most of the observed blocks, provided that marker ascertainment and the observed marker spacing are considered. In contrast, several long blocks are inconsistent with our evolutionary models, and different mechanisms could explain their origins.

6.
There is considerable interest in understanding patterns of linkage disequilibrium (LD) in the human genome, to aid investigations of human evolution and facilitate association studies in complex disease. The relative influences of meiotic crossover distribution and population history on LD remain unclear, however. In particular, it is uncertain to what extent crossovers are clustered into 'hot spots' that might influence LD patterns. As a first step to investigating the relationship between LD and recombination, we have analyzed a 216-kb segment of the class II region of the major histocompatibility complex (MHC) already characterized for familial crossovers. High-resolution LD analysis shows the existence of extended domains of strong association interrupted by patchwork areas of LD breakdown. Sperm typing shows that these areas correspond precisely to meiotic crossover hot spots. All six hot spots defined share a remarkably similar symmetrical morphology but vary considerably in intensity, and are not obviously associated with any primary DNA sequence determinants of hot-spot activity. These hot spots occur in clusters and together account for almost all crossovers in this region of the MHC. These data show that, within the MHC at least, crossovers are far from randomly distributed at the molecular level and that recombination hot spots can profoundly affect LD patterns.

7.
The fine-scale distribution of meiotic recombination events in the human genome can be inferred from patterns of haplotype diversity in human populations but directly studied only by high-resolution sperm typing. Both approaches indicate that crossovers are heavily clustered into narrow recombination hot spots. But our direct understanding of hot-spot properties and distributions is largely limited to sperm typing in the major histocompatibility complex (MHC). We now describe the analysis of an unremarkable 206-kb region on human chromosome 1, which identified localized regions of linkage disequilibrium breakdown that mark the locations of sperm crossover hot spots. The distribution, intensity and morphology of these hot spots are markedly similar to those in the MHC. But we also accidentally detected additional hot spots in regions of strong association. Coalescent analysis of genotype data detected most of the hot spots but showed significant differences between sperm crossover frequencies and historical recombination rates. This raises the possibility that some hot spots, particularly those in regions of strong association, may have evolved very recently and not left their full imprint on haplotype diversity. These results suggest that hot spots could be very abundant and possibly fluid features of the human genome.

8.
Crossover between the human sex chromosomes during male meiosis is restricted to the terminal pseudoautosomal pairing regions. An obligatory exchange occurs in PAR1, an Xp/Yp pseudoautosomal region of 2.6 Mb, which creates a male-specific recombination 'hot domain' with a recombination rate that is about 20 times higher than the genome average. Low-resolution analysis of PAR1 suggests that crossovers are distributed fairly randomly. By contrast, linkage disequilibrium (LD) and sperm crossover analyses indicate that crossovers in autosomal regions tend to cluster into 'hot spots' of 1-2 kb that lie between islands of disequilibrium of tens to hundreds of kilobases. To determine whether at high resolution this autosomal pattern also applies to PAR1, we have examined linkage disequilibrium over an interval of 43 kb around the gene SHOX. Here we show that in northern European populations, disequilibrium decays rapidly with physical distance, which is consistent with this interval of PAR1 being recombinationally active in male meiosis. Analysis of a subregion of 9.9 kb in sperm shows, however, that crossovers are not distributed randomly, but instead cluster into an intense recombination hot spot that is very similar in morphology to autosomal hot spots. Thus, PAR1 crossover activity may be influenced by male-specific hot spots that are highly suitable for characterization by sperm DNA analysis.

9.
Large-scale sequencing of cDNAs provides a complementary approach to structural analysis of the human genome by generating expressed sequence tags (ESTs). We have initiated the large-scale sequencing of a 3'-directed cDNA library from the human liver cell line HepG2 that is a non-biased representation of the mRNA population. 982 random cDNA clones were sequenced, yielding more than 270 kilobases. A significant portion of the identified genes encoded secretable proteins and components for protein synthesis. The abundance of cDNA species varied from 2.2% to less than 0.004%. Fifty-two percent of the mRNAs were abundant species consisting of 173 genes and the rest were non-abundant, consisting of about 6,600 genes.

10.
A complete BAC-based physical map of the Arabidopsis thaliana genome.   Total citations: 11, self-citations: 0, citations by others: 11
Arabidopsis thaliana is a small flowering plant that serves as the major model system in plant molecular genetics. The efforts of many scientists have produced genetic maps that provide extensive coverage of the genome (http://genome-www.stanford.edu/Arabidopsis/maps.html). Recently, detailed YAC, BAC, P1 and cosmid-based physical maps (that is, representations of genomic regions as sets of overlapping clones of corresponding libraries) have been established that extend over wide genomic areas ranging from several hundreds of kilobases to entire chromosomes. These maps provide an entry to gain deeper insight into the A. thaliana genome structure. A. thaliana has been chosen as the subject of the first large-scale project intended to determine the full genome sequence of a plant. This sequencing project, together with the increasing interest in map-based gene cloning, has highlighted the requirement for a complete and accurate physical map of this plant species. To supply the scientific community with a high-quality resource, we present here a complete physical map of A. thaliana using essentially the IGF BAC library. The map consists of 27 contigs that cover the entire genome, except for the presumptive centromeric regions, nucleolar organization regions (NOR) and telomeric areas. This is the first reported map of a complex organism based entirely on BAC clones and it represents the most homogeneous and complete physical map established to date for any plant genome. Furthermore, the analysis performed here serves as a model for an efficient physical mapping procedure using BAC clones that can be applied to other complex genomes.

11.
12.
The budding yeast Saccharomyces cerevisiae has been used by humans for millennia to make wine, beer and bread. More recently, it became a key model organism for studies of eukaryotic biology and for genomic analysis. However, relatively little is known about the natural lifestyle and population genetics of yeast. One major question is whether genetically diverse yeast strains mate and recombine in the wild. We developed a method to infer the evolutionary history of a species from genome sequences of multiple individuals and applied it to whole-genome sequence data from three strains of Saccharomyces cerevisiae and the sister species Saccharomyces paradoxus. We observed a pattern of sequence variation among yeast strains in which ancestral recombination events lead to a mosaic of segments with shared genealogy. Based on sequence divergence and the inferred median size of shared segments (approximately 2,000 bp), we estimated that although any two strains have undergone approximately 16 million cell divisions since their last common ancestor, only 314 outcrossing events have occurred during this time (roughly one every 50,000 divisions). Local correlations in polymorphism rates indicate that linkage disequilibrium in yeast should extend over kilobases. Our results provide the initial foundation for population studies of association between genotype and phenotype in S. cerevisiae.

13.
Kruglyak L. Nature Genetics, 1999, 22(2): 139-144
Recently, attention has focused on the use of whole-genome linkage disequilibrium (LD) studies to map common disease genes. Such studies would employ a dense map of single nucleotide polymorphisms (SNPs) to detect association between a marker and disease. Construction of SNP maps is currently underway. An essential issue yet to be settled is the required marker density of such maps. Here, I use population simulations to estimate the extent of LD surrounding common gene variants in the general human population as well as in isolated populations. Two main conclusions emerge from these investigations. First, a useful level of LD is unlikely to extend beyond an average distance of roughly 3 kb in the general population, which implies that approximately 500,000 SNPs will be required for whole-genome studies. Second, the extent of LD is similar in isolated populations unless the founding bottleneck is very narrow or the frequency of the variant is low (<5%).

14.
Here we provide a genome-wide, high-resolution map of the phylogenetic origin of the genome of most extant laboratory mouse inbred strains. Our analysis is based on the genotypes of wild-caught mice from three subspecies of Mus musculus. We show that classical laboratory strains are derived from a few fancy mice with limited haplotype diversity. Their genomes are overwhelmingly Mus musculus domesticus in origin, and the remainder is mostly of Japanese origin. We generated genome-wide haplotype maps based on identity by descent from fancy mice and show that classical inbred strains have limited and non-randomly distributed genetic diversity. In contrast, wild-derived laboratory strains represent a broad sampling of diversity within M. musculus. Intersubspecific introgression is pervasive in these strains, and contamination by laboratory stocks has played a role in this process. The subspecific origin, haplotype diversity and identity by descent maps can be visualized using the Mouse Phylogeny Viewer (see URLs).

15.
Exogenous double-stranded RNA (dsRNA) has been shown to exert homology-dependent effects at the level of both target mRNA stability and chromatin structure. Using C. elegans undergoing RNAi as an animal model, we have investigated the generality, scope and longevity of dsRNA-targeted chromatin effects and their dependence on components of the RNAi machinery. Using high-resolution genome-wide chromatin profiling, we found that a diverse set of genes can be induced to acquire locus-specific enrichment of histone H3 lysine 9 trimethylation (H3K9me3), with modification footprints extending several kilobases from the site of dsRNA homology and with locus specificity sufficient to distinguish the targeted locus from the other 20,000 genes in the C. elegans genome. Genetic analysis of the response indicated that factors responsible for secondary siRNA production during RNAi were required for effective targeting of chromatin. Temporal analysis revealed that H3K9me3, once triggered by dsRNA, can be maintained in the absence of dsRNA for at least two generations before being lost. These results implicate dsRNA-triggered chromatin modification in C. elegans as a programmable and locus-specific response defining a metastable state that can persist through generational boundaries.

16.
Nested chromosomal deletions induced with retroviral vectors in mice   Total citations: 9, self-citations: 0, citations by others: 9
Su H, Wang X, Bradley A. Nature Genetics, 2000, 24(1): 92-95
Chromosomal deletions, especially nested deletions, are major genetic tools in diploid organisms that facilitate the functional analysis of large chromosomal regions and allow the rapid localization of mutations to specific genetic intervals. In mice, well-characterized overlapping deletions are only available at a few chromosomal loci, partly due to drawbacks of existing methods. Here we exploit the random integration of a retrovirus to generate high-resolution sets of nested deletions around defined loci in embryonic stem (ES) cells, with sizes extending from a few kilobases to several megabases. This approach expands the application of Cre-loxP-based chromosome engineering because it not only allows the construction of hundreds of overlapping deletions, but also provides molecular entry points to regions based on the retroviral tags. Our approach can be extended to any region of the mouse genome.

17.
Whole-genome sequences provide a rich source of information about human evolution. Here we describe an effort to estimate key evolutionary parameters based on the whole-genome sequences of six individuals from diverse human populations. We used a Bayesian, coalescent-based approach to obtain information about ancestral population sizes, divergence times and migration rates from inferred genealogies at many neutrally evolving loci across the genome. We introduce new methods for accommodating gene flow between populations and integrating over possible phasings of diploid genotypes. We also describe a custom pipeline for genotype inference to mitigate biases from heterogeneous sequencing technologies and coverage levels. Our analysis indicates that the San population of southern Africa diverged from other human populations approximately 108-157 thousand years ago, that Eurasians diverged from an ancestral African population 38-64 thousand years ago, and that the effective population size of the ancestors of all modern humans was ~9,000.

18.
Emerging technologies make it possible for the first time to genotype hundreds of thousands of SNPs simultaneously, enabling whole-genome association studies. Using empirical genotype data from the International HapMap Project, we evaluate the extent to which the sets of SNPs contained on three whole-genome genotyping arrays capture common SNPs across the genome, and we find that the majority of common SNPs are well captured by these products either directly or through linkage disequilibrium. We explore analytical strategies that use HapMap data to improve power of association studies conducted with these fixed sets of markers and show that limited inclusion of specific haplotype tests in association analysis can increase the fraction of common variants captured by 25-100%. Finally, we introduce a Bayesian approach to association analysis by weighting the likelihood of each statistical test to reflect the number of putative causal alleles to which it is correlated.

19.
Detecting genetic variants that are highly divergent from a reference sequence remains a major challenge in genome sequencing. We introduce de novo assembly algorithms using colored de Bruijn graphs for detecting and genotyping simple and complex genetic variants in an individual or population. We provide an efficient software implementation, Cortex, the first de novo assembler capable of assembling multiple eukaryotic genomes simultaneously. Four applications of Cortex are presented. First, we detect and validate both simple and complex structural variations in a high-coverage human genome. Second, we identify more than 3 Mb of sequence absent from the human reference genome, in pooled low-coverage population sequence data from the 1000 Genomes Project. Third, we show how population information from ten chimpanzees enables accurate variant calls without a reference sequence. Last, we estimate classical human leukocyte antigen (HLA) genotypes at HLA-B, the most variable gene in the human genome.

20.
Genome-wide association studies of 14 agronomic traits in rice landraces   Total citations: 20, self-citations: 0, citations by others: 20
Huang X, Wei X, Sang T, Zhao Q, Feng Q, Zhao Y, Li C, Zhu C, Lu T, Zhang Z, Li M, Fan D, Guo Y, Wang A, Wang L, Deng L, Li W, Lu Y, Weng Q, Liu K, Huang T, Zhou T, Jing Y, Li W, Lin Z, Buckler ES, Qian Q, Zhang QF, Li J, Han B. Nature Genetics, 2010, 42(11): 961-967
Uncovering the genetic basis of agronomic traits in crop landraces that have adapted to various agro-climatic conditions is important to world food security. Here we have identified ~3.6 million SNPs by sequencing 517 rice landraces and constructed a high-density haplotype map of the rice genome using a novel data-imputation method. We performed genome-wide association studies (GWAS) for 14 agronomic traits in the population of Oryza sativa indica subspecies. The loci identified through GWAS explained ~36% of the phenotypic variance, on average. The peak signals at six loci were tied closely to previously identified genes. This study provides a fundamental resource for rice genetics research and breeding, and demonstrates that an approach integrating second-generation genome sequencing and GWAS can be used as a powerful complementary strategy to classical biparental cross-mapping for dissecting complex traits in rice.
