Similar articles
20 similar articles found (search time: 373 ms)
1.
The ability to detect recent natural selection in the human population would have profound implications for the study of human history and for medicine. Here, we introduce a framework for detecting the genetic imprint of recent positive selection by analysing long-range haplotypes in human populations. We first identify haplotypes at a locus of interest (core haplotypes). We then assess the age of each core haplotype by the decay of its association to alleles at various distances from the locus, as measured by extended haplotype homozygosity (EHH). Core haplotypes that have unusually high EHH and a high population frequency indicate the presence of a mutation that rose to prominence in the human gene pool faster than expected under neutral evolution. We applied this approach to investigate selection at two genes carrying common variants implicated in resistance to malaria: G6PD and CD40 ligand. At both loci, the core haplotypes carrying the proposed protective mutation stand out and show significant evidence of selection. More generally, the method could be used to scan the entire genome for evidence of recent positive selection.
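
The EHH statistic named in this abstract can be illustrated with a small Python sketch. It is not the authors' implementation. It assumes the core is a single SNP (the abstract describes multi-SNP core haplotypes), that haplotypes are phased, equal-length strings, and that EHH is the chance that two randomly chosen core-carrying chromosomes are identical over the interval from the core to a given site. The function name `ehh` and its parameters are hypothetical.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_idx, core_allele, upto_idx):
    """Extended haplotype homozygosity for one core allele (illustrative sketch).

    haplotypes  -- phased haplotypes as equal-length strings (assumption)
    core_idx    -- index of the single core SNP (assumption: one-site core)
    core_allele -- allele defining the core haplotype
    upto_idx    -- index of the most distant site to include
    """
    # Keep only chromosomes that carry the core allele.
    carriers = [h for h in haplotypes if h[core_idx] == core_allele]
    n = len(carriers)
    if n < 2:
        return float("nan")  # homozygosity is undefined for fewer than two chromosomes
    lo, hi = sorted((core_idx, upto_idx))
    # Group carriers by their extended haplotype over the interval.
    counts = Counter(h[lo:hi + 1] for h in carriers)
    # Fraction of carrier pairs that are identical over the whole interval.
    return sum(comb(c, 2) for c in counts.values()) / comb(n, 2)
```

Calling `ehh` for increasing `upto_idx` traces the decay of homozygosity with distance from the core; a core haplotype whose EHH stays high far from the locus while also being common is the pattern the abstract treats as a candidate signal of selection.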

2.
We describe a map of 1.42 million single nucleotide polymorphisms (SNPs) distributed throughout the human genome, providing an average density on available sequence of one SNP every 1.9 kilobases. These SNPs were primarily discovered by two projects: The SNP Consortium and the analysis of clone overlaps by the International Human Genome Sequencing Consortium. The map integrates all publicly available SNPs with described genes and other genomic features. We estimate that 60,000 SNPs fall within exons (coding and untranslated regions), and 85% of exons are within 5 kb of the nearest SNP. Nucleotide diversity varies greatly across the genome, in a manner broadly consistent with a standard population genetic model of human history. This high-density SNP map provides a public resource for defining haplotype variation across the genome, and should help to identify biomedically important genes for diagnosis and therapy.

3.
The use of comparative genomics to infer genome function relies on the understanding of how different components of the genome change over evolutionary time. The aim of such comparative analysis is to identify conserved, functionally transcribed sequences such as protein-coding genes and non-coding RNA genes, and other functional sequences such as regulatory regions, as well as other genomic features. Here, we have compared the entire human chromosome 21 with syntenic regions of the mouse genome, and have identified a large number of conserved blocks of unknown function. Although previous studies have made similar observations, it is unknown whether these conserved sequences are genes or not. Here we present an extensive experimental and computational analysis of human chromosome 21 in an effort to assign function to sequences conserved between human chromosome 21 (ref. 8) and the syntenic mouse regions. Our data support the presence of a large number of potentially functional non-genic sequences, probably regulatory and structural. The integration of the properties of the conserved components of human chromosome 21 to the rapidly accumulating functional data for this chromosome will improve considerably our understanding of the role of sequence conservation in mammalian genomes.

4.
A dense map of genetic variation in the laboratory mouse genome will provide insights into the evolutionary history of the species and lead to an improved understanding of the relationship between inter-strain genotypic and phenotypic differences. Here we resequence the genomes of four wild-derived and eleven classical strains. We identify 8.27 million high-quality single nucleotide polymorphisms (SNPs) densely distributed across the genome, and determine the locations of the high (divergent subspecies ancestry) and low (common subspecies ancestry) SNP-rate intervals for every pairwise combination of classical strains. Using these data, we generate a genome-wide haplotype map containing 40,898 segments, each with an average of three distinct ancestral haplotypes. For the haplotypes in the classical strains that are unequivocally assigned ancestry, the genetic contributions of the Mus musculus subspecies (M. m. domesticus, M. m. musculus, M. m. castaneus and the hybrid M. m. molossinus) are 68%, 6%, 3% and 10%, respectively; the remaining 13% of haplotypes are of unknown ancestral origin. The considerable regional redundancy of the SNP data will facilitate imputation of the majority of these genotypes in less-densely typed classical inbred strains to provide a complete view of variation in additional strains.

5.
An SNP map of human chromosome 22 (total citations: 35; self-citations: 0; citations by others: 35)
The human genome sequence will provide a reference for measuring DNA sequence variation in human populations. Sequence variants are responsible for the genetic component of individuality, including complex characteristics such as disease susceptibility and drug response. Most sequence variants are single nucleotide polymorphisms (SNPs), where two alternate bases occur at one position. Comparison of any two genomes reveals around 1 SNP per kilobase. A sufficiently dense map of SNPs would allow the detection of sequence variants responsible for particular characteristics on the basis that they are associated with a specific SNP allele. Here we have evaluated large-scale sequencing approaches to obtaining SNPs, and have constructed a map of 2,730 SNPs on human chromosome 22. Most of the SNPs are within 25 kilobases of a transcribed exon, and are valuable for association studies. We have scaled up the process, detecting over 65,000 SNPs in the genome as part of The SNP Consortium programme, which is on target to build a map of 1 SNP every 5 kilobases that is integrated with the human genome sequence and that is freely available in the public domain.

6.
Here we present a finished sequence of human chromosome 15, together with a high-quality gene catalogue. As chromosome 15 is one of seven human chromosomes with a high rate of segmental duplication, we have carried out a detailed analysis of the duplication structure of the chromosome. Segmental duplications in chromosome 15 are largely clustered in two regions, on proximal and distal 15q; the proximal region is notable because recombination among the segmental duplications can result in deletions causing Prader-Willi and Angelman syndromes. Sequence analysis shows that the proximal and distal regions of 15q share extensive ancient similarity. Using a simple approach, we have been able to reconstruct many of the events by which the current duplication structure arose. We find that most of the intrachromosomal duplications seem to share a common ancestry. Finally, we demonstrate that some remaining gaps in the genome sequence are probably due to structural polymorphisms between haplotypes; this may explain a significant fraction of the gaps remaining in the human genome.

7.
The mosaic structure of variation in the laboratory mouse genome (total citations: 56; self-citations: 0; citations by others: 56)
Most inbred laboratory mouse strains are known to have originated from a mixed but limited founder population in a few laboratories. However, the effect of this breeding history on patterns of genetic variation among these strains and the implications for their use are not well understood. Here we present an analysis of the fine structure of variation in the mouse genome, using single nucleotide polymorphisms (SNPs). When the recently assembled genome sequence from the C57BL/6J strain is aligned with sample sequence from other strains, we observe long segments of either extremely high (approximately 40 SNPs per 10 kb) or extremely low (approximately 0.5 SNPs per 10 kb) polymorphism rates. In all strain-to-strain comparisons examined, only one-third of the genome falls into long regions (averaging >1 Mb) of a high SNP rate, consistent with estimated divergence rates between Mus musculus domesticus and either M. m. musculus or M. m. castaneus. These data suggest that the genomes of these inbred strains are mosaics with the vast majority of segments derived from domesticus and musculus sources. These observations have important implications for the design and interpretation of positional cloning experiments.
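
The high and low SNP-rate segments described in this abstract can be pictured with a simple windowed count. The Python sketch below is only an illustration, not the method used in the paper. It assumes 0-based SNP positions, fixed 10 kb windows, and an arbitrary cutoff of 10 SNPs per window, chosen here only because it lies between the reported rates of roughly 40 and 0.5 SNPs per 10 kb. All function names are hypothetical.

```python
def snp_rate_windows(positions, region_length, window=10_000):
    """Count SNPs in consecutive fixed-size windows (illustrative sketch)."""
    n_windows = -(-region_length // window)  # ceiling division
    counts = [0] * n_windows
    for pos in positions:
        if not 0 <= pos < region_length:
            raise ValueError(f"position {pos} lies outside the region")
        counts[pos // window] += 1
    return counts


def label_windows(counts, high_cutoff=10):
    """Label each window 'high' or 'low'; the cutoff is an assumption."""
    return ["high" if c >= high_cutoff else "low" for c in counts]
```

Runs of adjacent windows with the same label would then correspond to the long segments the abstract describes.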

8.
With the successful completion of the Human Genome Project, one of the scientific milestones, genetic variations and their functional implications have become one of the focuses in genome research. It has been known that genetic variations, together with environment, are responsible for the differences in complex traits in individuals: physical characteristics, dise…

9.
The abundance of single nucleotide polymorphisms (SNPs) makes the haplotype-based method, rather than the single-marker-oriented method, the main approach to association studies on QTL mapping. The key problem in the haplotype-based method is how to reconstruct haplotypes from genotype data. Directly assaying haplotypes in diploid individuals by experimental methods is too expensive, so in silico haplotype-determination methods are the major choice at present. This paper presents a rapid and reliable algorithm for haplotype reconstruction for tightly linked SNPs in general pedigrees. It is based on six rules and consists of three steps. First, the parental origins of alleles in offspring are assigned conditional on genotypes in parent-offspring trios; second, the redundant haplotypes are eliminated based on the six rules; and finally, the most likely haplotype combinations are chosen via the maximum likelihood method. Our method was verified and compared with PEDPHASE using simulated data with different pedigree sizes, numbers of loci, and proportions of missing genotypes. The results show that our algorithm was superior to PEDPHASE in terms of computing time and accuracy of haplotype estimation. The computing time for 100 runs was 10 to 15 times shorter and the accuracy was 4% to 10% higher than with PEDPHASE. The results also indicate that our method was very robust and was hardly affected by pedigree size, number of loci, or proportion of missing genotypes.
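
The first of the three steps, assigning the parental origin of a child's alleles within a parent-offspring trio, can be sketched for a single SNP as follows. This Python example is an assumption-laden illustration and does not reproduce the paper's six rules or its maximum likelihood step. Genotypes are assumed to be unordered pairs of alleles, and the function name `assign_parental_origin` is hypothetical.

```python
def assign_parental_origin(father, mother, child):
    """Return (paternal_allele, maternal_allele) for the child at one SNP.

    Each genotype is an unordered pair of alleles, e.g. ("A", "G").
    Returns None when the origin is ambiguous (for example, all three
    individuals heterozygous) or the trio is Mendelian-inconsistent.
    """
    a, b = child
    candidates = set()
    # Try both ways of splitting the child's alleles between the parents.
    for paternal, maternal in ((a, b), (b, a)):
        if paternal in father and maternal in mother:
            candidates.add((paternal, maternal))
    if len(candidates) == 1:
        return candidates.pop()
    return None
```

Sites that return `None` are the ones that would need the later steps (rule-based elimination and likelihood-based choice) or information from linked loci to resolve.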

10.
Genomic variation is the genetic basis of phenotypic diversity among individuals, including variation in disease susceptibility and drug response. The greatest promise of the International HapMap is to provide roadmaps for identifying genetic variants predisposing to complex diseases. The single nucleotide polymorphism (SNP) is the fundamental element of the HapMap. The allele frequency of SNPs is one of the major factors affecting the resulting HapMap, being the factor upon which linkage disequilibrium (LD) is calculated, haplotypes are constructed, and tagging SNPs (tagSNPs) are selected. The cutoff thresholds for the frequency of minor alleles used in the making of the map therefore have profound effects on the resolution of that map. To date most researchers have adopted their own cutoff thresholds, and there has been little real dataset-based evaluation of the effects of different cutoff thresholds on HapMap resolution. In an attempt to assess the implications of different cutoff values, we analyzed our own data for the centromeric genes on Chromosome 15 in Chinese Han and Tibetan populations, with respect to minor allele frequency cutoff values of >0.01 (0.01 group), >0.05 (0.05 group), and >0.10 (0.10 group), and constructed HapMaps from each of the datasets. The resolution, study power and cost-effectiveness for each of the maps were compared. Our results show that the 0.01 threshold provides the greatest power (P = 0.019 in Han and P = 0.029 in Tibetan for 0.01 vs. 0.05 threshold) and detects the most population-specific haplotypes (P = 0.012 for 0.01 vs. 0.05 threshold). However, in the regions studied, the 0.05 cutoff threshold did not significantly increase power above the 0.10 threshold (P = 0.191 in Han; 1.000 in Tibetans), and did not improve resolution over the 0.10 value for population-specific haplotypes (P = 0.592) either. Furthermore, the 0.05 and 0.10 values produced the same figures for tagging efficiency, LD block number, LD length, study power and cost-savings in the Tibetan population. These results suggest that a lower cutoff value is more appropriate for studies in which population-specific haplotypes are crucial, and that the most appropriate cutoff value may differ between populations. Because only a limited number of genes were studied in this project, more studies should be conducted to further address this important issue.
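
To make the three cutoff groups concrete, the following Python sketch computes a minor allele frequency (MAF) per SNP and keeps SNPs whose MAF exceeds a given cutoff. It illustrates the filtering idea only and is not the authors' pipeline. It assumes genotypes are coded as alternate-allele counts (0, 1 or 2) with `None` for missing calls; the function names and SNP labels are hypothetical.

```python
def minor_allele_frequency(genotypes):
    """MAF from per-individual alternate-allele counts (0, 1, 2; None = missing)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None  # no usable genotype calls
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def filter_by_maf(snps, cutoff):
    """Names of SNPs whose MAF is strictly greater than the cutoff."""
    kept = []
    for name, genotypes in snps.items():
        maf = minor_allele_frequency(genotypes)
        if maf is not None and maf > cutoff:
            kept.append(name)
    return kept
```

Running `filter_by_maf` with cutoffs of 0.01, 0.05 and 0.10 on the same genotype table yields nested SNP sets, which is why a lower cutoff can retain rarer, possibly population-specific variants at the cost of typing more markers.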

11.
Wang J  Wang W  Li R  Li Y  Tian G  Goodman L  Fan W  Zhang J  Li J  Zhang J  Guo Y  Feng B  Li H  Lu Y  Fang X  Liang H  Du Z  Li D  Zhao Y  Hu Y  Yang Z  Zheng H  Hellmann I  Inouye M  Pool J  Yi X  Zhao J  Duan J  Zhou Y  Qin J  Ma L  Li G  Yang Z  Zhang G  Yang B  Yu C  Liang F  Li W  Li S  Li D  Ni P  Ruan J  Li Q  Zhu H  Liu D  Lu Z  Li N  Guo G  Zhang J  Ye J  Fang L  Hao Q  Chen Q  Liang Y  Su Y  San A  Ping C  Yang S  Chen F  Li L  Zhou K  Zheng H  Ren Y  Yang L  Gao Y  Yang G  Li Z  Feng X  Kristiansen K  Wong GK  Nielsen R  Durbin R  Bolund L  Zhang X 《Nature》2008,456(7218):60-65
Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics.

12.
Recent advances in whole-genome sequencing have brought the vision of personal genomics and genomic medicine closer to reality. However, current methods lack clinical accuracy and the ability to describe the context (haplotypes) in which genome variants co-occur in a cost-effective manner. Here we describe a low-cost DNA sequencing and haplotyping process, long fragment read (LFR) technology, which is similar to sequencing long single DNA molecules without cloning or separation of metaphase chromosomes. In this study, ten LFR libraries were made using only ~100 picograms of human DNA per sample. Up to 97% of the heterozygous single nucleotide variants were assembled into long haplotype contigs. Removal of false positive single nucleotide variants not phased by multiple LFR haplotypes resulted in a final genome error rate of 1 in 10 megabases. Cost-effective and accurate genome sequencing and haplotyping from 10-20 human cells, as demonstrated here, will enable comprehensive genetic studies and diverse clinical applications.

13.
Most genomic variation is attributable to single nucleotide polymorphisms (SNPs), which therefore offer the highest resolution for tracking disease genes and population history. It has been proposed that a dense map of 30,000-500,000 SNPs can be used to scan the human genome for haplotypes associated with common diseases. Here we describe a simple but powerful method, called reduced representation shotgun (RRS) sequencing, for creating SNP maps. RRS re-samples specific subsets of the genome from several individuals, and compares the resulting sequences using a highly accurate SNP detection algorithm. The method can be extended by alignment to available genome sequence, increasing the yield of SNPs and providing map positions. These methods are being used by The SNP Consortium, an international collaboration of academic centres, pharmaceutical companies and a private foundation, to discover and release at least 300,000 human SNPs. We have discovered 47,172 human SNPs by RRS, and in total the Consortium has identified 148,459 SNPs. More broadly, RRS facilitates the rapid, inexpensive construction of SNP maps in biomedically and agriculturally important species. SNPs discovered by RRS also offer unique advantages for large-scale genotyping.

14.
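Drawing on the GenBank database, we carried out a preliminary analysis of the distribution, types and density of SNPs in the CDS regions of chromosomes 1, X and Y. The statistics show that the AG (GA) and CT types predominate, each accounting for more than 30%; the AC, AT, GT and CG types occur at frequencies of the same order of magnitude, each accounting for about 8%; and these frequencies do not vary between chromosomes. However, the distribution of SNPs within genes differs greatly between chromosomes. SNP density is lowest on the Y chromosome and highest on chromosome 1. In general, SNPs are located mainly in introns, but on the Y chromosome the frequency of SNPs in exons is markedly higher.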
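
The six SNP types tallied in this abstract (AG and CT, which are transitions, and AC, AT, GT and CG, which are transversions) can be counted with a short Python sketch. It is an illustration under stated assumptions rather than the authors' procedure: each SNP is given as a pair of single-base alleles, the pair is treated as unordered so that GA counts as AG, and the function names are hypothetical.

```python
from collections import Counter

TRANSITIONS = {"AG", "CT"}  # purine-purine and pyrimidine-pyrimidine changes


def substitution_class(allele1, allele2):
    """Return the unordered two-base class of a SNP, e.g. ("G", "A") -> "AG"."""
    pair = "".join(sorted((allele1.upper(), allele2.upper())))
    if len(pair) != 2 or pair[0] == pair[1] or set(pair) - set("ACGT"):
        raise ValueError(f"not a biallelic single-base SNP: {allele1!r}/{allele2!r}")
    return pair


def class_fractions(snps):
    """Fraction of SNPs falling in each substitution class."""
    counts = Counter(substitution_class(a, b) for a, b in snps)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def is_transition(cls):
    """True for AG and CT; the other four classes are transversions."""
    return cls in TRANSITIONS
```

Applying `class_fractions` separately to the SNPs of each chromosome would allow the kind of between-chromosome comparison of type frequencies that the abstract reports.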

15.
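LTR (long terminal repeat) retrotransposons are a class of genetic elements ubiquitous in eukaryotic genomes that continually replicate themselves within the genome via an RNA intermediate. In higher plants, LTR retrotransposons are one of the major components of the genome. In this study, LTR retrotransposons in the upland cotton genome were mined and annotated using several methods. The results show that the distribution of the Gypsy superfamily of LTR retrotransposons in the upland cotton genome is roughly inversely related to that of genes, whereas the Copia superfamily is more abundant at the starting end of each chromosome. Pearson correlation coefficients showed a strong correlation between LTR retrotransposon copy number and chromosome size. Genes located upstream and downstream of LTR retrotransposons show similar enrichment patterns, with molecular functions concentrated mainly in binding and catalytic activity. These results deepen our understanding of LTR retrotransposons in upland cotton and provide important data to support further study of the cotton genome.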

16.
A first-generation linkage disequilibrium map of human chromosome 22 (total citations: 58; self-citations: 0; citations by others: 58)
DNA sequence variants in specific genes or regions of the human genome are responsible for a variety of phenotypes such as disease risk or variable drug response. These variants can be investigated directly, or through their non-random associations with neighbouring markers (called linkage disequilibrium (LD)). Here we report measurement of LD along the complete sequence of human chromosome 22. Duplicate genotyping and analysis of 1,504 markers in Centre d'Etude du Polymorphisme Humain (CEPH) reference families at a median spacing of 15 kilobases (kb) reveals a highly variable pattern of LD along the chromosome, in which extensive regions of nearly complete LD up to 804 kb in length are interspersed with regions of little or no detectable LD. The LD patterns are replicated in a panel of unrelated UK Caucasians. There is a strong correlation between high LD and low recombination frequency in the extant genetic map, suggesting that historical and contemporary recombination rates are similar. This study demonstrates the feasibility of developing genome-wide maps of LD.
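
Pairwise LD between two markers is commonly summarised by the statistics r² and |D′|. The Python sketch below computes both from phased two-locus haplotypes. It is a generic textbook calculation and is not claimed to be the measure or procedure used in this study. It assumes biallelic markers with alleles coded 0 and 1; the function name `ld_stats` is hypothetical.

```python
def ld_stats(haplotypes):
    """Return (r_squared, abs_d_prime) for two biallelic loci.

    haplotypes -- list of (allele_at_locus_A, allele_at_locus_B) pairs,
                  alleles coded 0/1, one pair per phased chromosome (assumption).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n                         # freq of allele 1 at A
    p_b = sum(b for _, b in haplotypes) / n                         # freq of allele 1 at B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n   # freq of haplotype 1-1
    d = p_ab - p_a * p_b                                            # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("at least one locus is monomorphic")
    r_squared = d * d / denom
    # Largest |D| attainable given the allele frequencies and the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return r_squared, abs(d) / d_max
```

Evaluating `ld_stats` for marker pairs along a chromosome and plotting the values against physical distance is one simple way to visualise alternating regions of strong and weak LD of the kind described in the abstract.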

17.
A physical map of the human Y chromosome (total citations: 24; self-citations: 0; citations by others: 24)
The non-recombining region of the human Y chromosome (NRY), which comprises 95% of the chromosome, does not undergo sexual recombination and is present only in males. An understanding of its biological functions has begun to emerge from DNA studies of individuals with partial Y chromosomes, coupled with molecular characterization of genes implicated in gonadal sex reversal, Turner syndrome, graft rejection and spermatogenic failure. But mapping strategies applied successfully elsewhere in the genome have faltered in the NRY, where there is no meiotic recombination map and intrachromosomal repetitive sequences are abundant. Here we report a high-resolution physical map of the euchromatic, centromeric and heterochromatic regions of the NRY and its construction by unusual methods, including genomic clone subtraction and dissection of sequence family variants. Of the map's 758 DNA markers, 136 have multiple locations in the NRY, reflecting its unusually repetitive sequence composition. The markers anchor 1,038 bacterial artificial chromosome clones, 199 of which form a tiling path for sequencing.

18.
High-resolution mapping of meiotic crossovers and non-crossovers in yeast (total citations: 1; self-citations: 0; citations by others: 1)
Mancera E  Bourgon R  Brozzi A  Huber W  Steinmetz LM 《Nature》2008,454(7203):479-485
Meiotic recombination has a central role in the evolution of sexually reproducing organisms. The two recombination outcomes, crossover and non-crossover, increase genetic diversity, but have the potential to homogenize alleles by gene conversion. Whereas crossover rates vary considerably across the genome, non-crossovers and gene conversions have only been identified in a handful of loci. To examine recombination genome wide and at high spatial resolution, we generated maps of crossovers, crossover-associated gene conversion and non-crossover gene conversion using dense genetic marker data collected from all four products of fifty-six yeast (Saccharomyces cerevisiae) meioses. Our maps reveal differences in the distributions of crossovers and non-crossovers, showing more regions where either crossovers or non-crossovers are favoured than expected by chance. Furthermore, we detect evidence for interference between crossovers and non-crossovers, a phenomenon previously only known to occur between crossovers. Up to 1% of the genome of each meiotic product is subject to gene conversion in a single meiosis, with detectable bias towards GC nucleotides. To our knowledge the maps represent the first high-resolution, genome-wide characterization of the multiple outcomes of recombination in any organism. In addition, because non-crossover hotspots create holes of reduced linkage within haplotype blocks, our results stress the need to incorporate non-crossovers into genetic linkage analysis.

19.
Chromosome 17 is unusual among the human chromosomes in many respects. It is the largest human autosome with orthology to only a single mouse chromosome, mapping entirely to the distal half of mouse chromosome 11. Chromosome 17 is rich in protein-coding genes, having the second highest gene density in the genome. It is also enriched in segmental duplications, ranking third in density among the autosomes. Here we report a finished sequence for human chromosome 17, as well as a structural comparison with the finished sequence for mouse chromosome 11, the first finished mouse chromosome. Comparison of the orthologous regions reveals striking differences. In contrast to the typical pattern seen in mammalian evolution, the human sequence has undergone extensive intrachromosomal rearrangement, whereas the mouse sequence has been remarkably stable. Moreover, although the human sequence has a high density of segmental duplication, the mouse sequence has a very low density. Notably, these segmental duplications correspond closely to the sites of structural rearrangement, demonstrating a link between duplication and rearrangement. Examination of the main classes of duplicated segments provides insight into the dynamics underlying expansion of chromosome-specific, low-copy repeats in the human genome.

20.
Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.
