Similar articles
20 similar articles found.
1.
We describe a map of 1.42 million single nucleotide polymorphisms (SNPs) distributed throughout the human genome, providing an average density on available sequence of one SNP every 1.9 kilobases. These SNPs were primarily discovered by two projects: The SNP Consortium and the analysis of clone overlaps by the International Human Genome Sequencing Consortium. The map integrates all publicly available SNPs with described genes and other genomic features. We estimate that 60,000 SNPs fall within exons (coding and untranslated regions), and 85% of exons are within 5 kb of the nearest SNP. Nucleotide diversity varies greatly across the genome, in a manner broadly consistent with a standard population genetic model of human history. This high-density SNP map provides a public resource for defining haplotype variation across the genome, and should help to identify biomedically important genes for diagnosis and therapy.

2.
Global variation in copy number in the human genome (cited by: 3; self-citations: 0; other citations: 3)
Copy number variation (CNV) of DNA sequences is functionally significant but has yet to be fully ascertained. We have constructed a first-generation CNV map of the human genome through the study of 270 individuals from four populations with ancestry in Europe, Africa or Asia (the HapMap collection). DNA from these individuals was screened for CNV using two complementary technologies: single-nucleotide polymorphism (SNP) genotyping arrays, and clone-based comparative genomic hybridization. A total of 1,447 copy number variable regions (CNVRs), which can encompass overlapping or adjacent gains or losses, covering 360 megabases (12% of the genome) were identified in these populations. These CNVRs contained hundreds of genes, disease loci, functional elements and segmental duplications. Notably, the CNVRs encompassed more nucleotide content per genome than SNPs, underscoring the importance of CNV in genetic diversity and evolution. The data obtained delineate linkage disequilibrium patterns for many CNVs, and reveal marked variation in copy number among populations. We also demonstrate the utility of this resource for genetic disease studies.

3.
Genome-wide patterns of variation across individuals provide a powerful source of data for uncovering the history of migration, range expansion, and adaptation of the human species. However, high-resolution surveys of variation in genotype, haplotype and copy number have generally focused on a small number of population groups. Here we report the analysis of high-quality genotypes at 525,910 single-nucleotide polymorphisms (SNPs) and 396 copy-number-variable loci in a worldwide sample of 29 populations. Analysis of SNP genotypes yields strongly supported fine-scale inferences about population structure. Increasing linkage disequilibrium is observed with increasing geographic distance from Africa, as expected under a serial founder effect for the out-of-Africa spread of human populations. New approaches for haplotype analysis produce inferences about population structure that complement results based on unphased SNPs. Despite a difference from SNPs in the frequency spectrum of the copy-number variants (CNVs) detected (including a comparatively large number of CNVs in previously unexamined populations from Oceania and the Americas), the global distribution of CNVs largely accords with population structure analyses for SNP data sets of similar size. Our results produce new inferences about inter-population variation, support the utility of CNVs in human population-genetic research, and serve as a genomic resource for human-genetic studies in diverse worldwide populations.

4.
Most genomic variation is attributable to single nucleotide polymorphisms (SNPs), which therefore offer the highest resolution for tracking disease genes and population history. It has been proposed that a dense map of 30,000-500,000 SNPs can be used to scan the human genome for haplotypes associated with common diseases. Here we describe a simple but powerful method, called reduced representation shotgun (RRS) sequencing, for creating SNP maps. RRS re-samples specific subsets of the genome from several individuals, and compares the resulting sequences using a highly accurate SNP detection algorithm. The method can be extended by alignment to available genome sequence, increasing the yield of SNPs and providing map positions. These methods are being used by The SNP Consortium, an international collaboration of academic centres, pharmaceutical companies and a private foundation, to discover and release at least 300,000 human SNPs. We have discovered 47,172 human SNPs by RRS, and in total the Consortium has identified 148,459 SNPs. More broadly, RRS facilitates the rapid, inexpensive construction of SNP maps in biomedically and agriculturally important species. SNPs discovered by RRS also offer unique advantages for large-scale genotyping.

5.
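This study examined single nucleotide polymorphisms (SNPs) in the genes encoding apolipoprotein E (APOE) and SHCP66 (Src homology-domain containing protein, 66 kDa) in an isolated Chinese population, and their association with longevity. Polymerase chain reaction combined with sequencing was used to detect SNPs in the exon regions of the two genes in a small sample (40 randomly selected individuals), and two candidate sites in APOE were then genotyped in a large sample. Three SNPs were found in the APOE gene, one of which, located in the regulatory region, is reported for the first time. Three SNPs and one deletion mutation were found in SHCP66, all reported for the first time. A chi-square test of the allele frequencies at the two APOE candidate sites showed no significant difference between long-lived individuals and controls in the isolated population (P > 0.05). In this long-lived population, therefore, polymorphism in this gene shows no significant association with longevity.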

6.
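Abstract. Objective: To analyze the population genetic structure of three outbred mouse populations from the National Rodent Laboratory Animal Seed Center using single nucleotide polymorphism (SNP) loci. Methods: Forty-five SNP loci were selected from the literature. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) was used for allele genotyping of one ICR population each from the Beijing and Shanghai sub-centers of the National Rodent Laboratory Animal Seed Center and one KM outbred population from the Shanghai sub-center. Population genetic structure and genetic differences between populations were then analyzed. Results: Genetic parameters including the effective number of alleles (Ne), observed heterozygosity (Ho), expected heterozygosity (He), average heterozygosity (Ha), polymorphism information content (PIC) and Shannon information index differed among the three outbred populations. For the same population, the parameter values obtained with SNPs were lower than those obtained with microsatellite DNA and biochemical loci. After SNP loci that were monomorphic in the three populations were removed, the parameter values rose, and the Shannon index in particular approached the values obtained with STRs. Conclusion: The selected SNP loci can be used for genetic quality testing of outbred mice. Ranked by heterozygosity, the genetic diversity of the three populations was SKM > BICR > SICR.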
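The diversity statistics named in this abstract can be illustrated with a short sketch. It uses the standard textbook definitions of expected heterozygosity, effective number of alleles, polymorphism information content and Shannon information index, computed from the allele frequencies at a single locus. The function name and the example frequencies are hypothetical and are not taken from the study, which may have used different software or estimators.

```python
import math
from itertools import combinations

def diversity_stats(freqs):
    """Return (He, Ne, PIC, Shannon I) for one locus from its allele frequencies.

    Illustrative helper only; the name and interface are assumptions.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    he = 1.0 - homozygosity            # expected heterozygosity
    ne = 1.0 / homozygosity            # effective number of alleles
    # PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    pic = he - sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    # Shannon information index; alleles with frequency 0 contribute nothing
    shannon = -sum(p * math.log(p) for p in freqs if p > 0)
    return he, ne, pic, shannon

# Hypothetical biallelic SNP with equal allele frequencies
print(diversity_stats([0.5, 0.5]))
```

Under these definitions a monomorphic locus gives He = 0, PIC = 0 and a Shannon index of 0, so averaging over a panel that includes monomorphic loci pulls the values down, which is in line with the abstract's remark that the parameters rose once such loci were removed.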

7.
An SNP map of human chromosome 22 (cited by: 35; self-citations: 0; other citations: 35)
The human genome sequence will provide a reference for measuring DNA sequence variation in human populations. Sequence variants are responsible for the genetic component of individuality, including complex characteristics such as disease susceptibility and drug response. Most sequence variants are single nucleotide polymorphisms (SNPs), where two alternate bases occur at one position. Comparison of any two genomes reveals around 1 SNP per kilobase. A sufficiently dense map of SNPs would allow the detection of sequence variants responsible for particular characteristics on the basis that they are associated with a specific SNP allele. Here we have evaluated large-scale sequencing approaches to obtaining SNPs, and have constructed a map of 2,730 SNPs on human chromosome 22. Most of the SNPs are within 25 kilobases of a transcribed exon, and are valuable for association studies. We have scaled up the process, detecting over 65,000 SNPs in the genome as part of The SNP Consortium programme, which is on target to build a map of 1 SNP every 5 kilobases that is integrated with the human genome sequence and that is freely available in the public domain.

8.
Recent advances have shown that the majority of the nucleotide variation in the human genome consists of single nucleotide polymorphisms (SNPs). Using SNPs, each chromosome can be divided into different haplotype blocks, and there are limited common haplotypes in each block. This provides a powerful approach for whole-genome scans for disease-associated genes/variants. However, most data available today are based on large-scale genomic analyses; data concerning individual genes for fine mapping with high-density SNPs are relatively lacking. We have sequenced 7 genes and their flanking regions, identified 34 novel SNPs, and constructed high-density SNP haplotypes and haplotype blocks in 5 genes in the centromeric region of chromosome 15 in 100 Chinese Han subjects. Our results show that there is great heterogeneity in the haplotypes and haplotype block structures within and between these genes, which are in close physical proximity. Data obtained in this study provide a useful tool for the candidate gene approach at the fine scale for identifying disease-contributing variants in the genes/regions.

9.
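Objective: To develop a large number of reliable SNP markers, providing a molecular basis for constructing a high-density genetic linkage map of Liriodendron and for genome-based selective breeding of forest trees. Methods: From an F1 hybrid population with Liriodendron tulipifera genotype NK as the female parent and Liriodendron chinense genotype LS as the male parent, 198 individuals were selected as the mapping population. Genomic DNA from 200 individuals (the 2 parents and the 198 progeny) was digested with the restriction enzyme EcoRI, and RAD (restriction-site associated DNA) libraries were constructed and sequenced by RAD-seq, using paired-end sequencing with a read length of 91 bp. The average sequencing depth was 2× for the two parents and 0.8× for the 198 progeny, with an average yield of 1.94 Gb and about 387.21 Gb of data in total. RAD reads from each sample were analyzed with the Stacks software. Candidate loci were subjected to chi-square tests and missing-rate checks, and the markers that conformed to Mendelian inheritance, together with their corresponding RAD-tag sequences, were aligned to the Liriodendron reference genome sequence. Finally, 27 candidate SNP loci were selected from the SNP markers developed in this study, primers were designed, and 16 randomly chosen F1 individuals were PCR-amplified and sequenced to validate the SNPs. Results: A total of 22,019 SNP loci were identified in the candidate population, of which 4,233 markers conformed to Mendelian inheritance, and 3,501 candidate SNP markers were finally obtained. In the SNP validation, 293 SNP markers were successfully sequenced and could be scored. All of these loci were SNP loci, and the variant type was confirmed for 194 (66.2%) of them. Conclusion: A strategy based on RAD-seq and the Liriodendron reference genome sequence is a rapid and effective means of large-scale molecular marker development and can be used to construct high-density genetic maps for Liriodendron and other forest trees.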
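The chi-square screening step mentioned in the methods can be sketched as a goodness-of-fit test of observed genotype counts against an expected Mendelian ratio. The abstract does not state the expected ratios or the significance threshold, so the 1:1 ratio (a marker heterozygous in only one parent), the 0.05 threshold, the function names and the counts below are all assumptions made for illustration. This is not the Stacks implementation.

```python
import math

def chi_square_1to1(count_a, count_b):
    """Chi-square goodness-of-fit test against a 1:1 ratio (1 degree of freedom)."""
    total = count_a + count_b
    if total == 0:
        raise ValueError("no observations")
    expected = total / 2.0
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

def fits_mendelian(count_a, count_b, alpha=0.05):
    """Keep a marker when the 1:1 hypothesis is not rejected at level alpha (assumed)."""
    _, p_value = chi_square_1to1(count_a, count_b)
    return p_value >= alpha

# Hypothetical genotype counts among 198 progeny
print(fits_mendelian(110, 88))   # mild deviation from 99:99
print(fits_mendelian(140, 58))   # strong segregation distortion
```

Other segregation types, such as 1:2:1 for markers heterozygous in both parents, would need more genotype classes and a chi-square distribution with more degrees of freedom.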

10.
With the successful completion of the Human Genome Project, one of the scientific milestones, genetic variations and their functional implications have become one of the focuses in genome research. It has been known that genetic variations, together with environment, are responsible for the differences in complex traits in individuals: physical characteristics, dise…

11.
Genomic variation is the genetic basis of phenotypic diversity among individuals, including variation in disease susceptibility and drug response. The greatest promise of the International HapMap is to provide roadmaps for identifying genetic variants predisposing to complex diseases. The single nucleotide polymorphism (SNP) is the fundamental element of the HapMap. Allele frequency of SNPs is one of the major factors affecting the resulting HapMap, being the factor upon which linkage disequilibrium (LD) is calculated, haplotypes are constructed, and tagging SNPs (tagSNPs) are selected. The cutoff thresholds for the frequency of minor alleles used in the making of the map therefore have profound effects on the resolution of that map. To date most researchers have adopted their own cutoff thresholds, and there has been little real dataset-based evaluation of the effects of different cutoff thresholds on HapMap resolution. In an attempt to assess the implications of different cutoff values, we analyzed our own data for the centromeric genes on Chromosome 15 in Chinese Han and Tibetan populations, with respect to minor allele frequency cutoff values of >0.01 (0.01 group), >0.05 (0.05 group), and >0.10 (0.10 group), and constructed HapMaps from each of the datasets. The resolution, study power and cost-effectiveness for each of the maps were compared. Our results show that the 0.01 threshold provides the greatest power (P = 0.019 in Han and P = 0.029 in Tibetan for 0.01 vs. 0.05 threshold) and detects most population-specific haplotypes (P = 0.012 for 0.01 vs. 0.05 threshold). However, in the regions studied, the 0.05 cutoff threshold did not significantly increase power above the 0.10 threshold (P = 0.191 in Han; 1.000 in Tibetans), and did not improve resolution over the 0.10 value for population-specific haplotypes (P = 0.592) either. Furthermore, the 0.05 and 0.10 values produced the same figures for tagging efficiency, LD block number, LD length, study power and cost-savings in the Tibetan population. These results suggest that a lower cutoff value is more appropriate for studies in which population-specific haplotypes are crucial, and that the most appropriate cutoff value may differ between populations. Because only a limited number of genes were studied in this project, more studies should be conducted to further address this important issue.
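The effect of a minor allele frequency (MAF) cutoff can be made concrete with a small sketch that computes the MAF of each SNP from genotype counts and keeps the SNPs above each of the three cutoffs named in the abstract. The SNP names, genotype counts and function names are hypothetical; only the cutoff values (0.01, 0.05, 0.10) and the strict ">" comparison come from the text.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF of a biallelic SNP from counts of the AA, AB and BB genotypes."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotypes")
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)

def snps_passing(genotype_counts, cutoff):
    """Names of SNPs whose MAF is strictly greater than the cutoff."""
    return [name for name, counts in genotype_counts.items()
            if minor_allele_frequency(*counts) > cutoff]

# Hypothetical counts for 100 individuals per SNP: (AA, AB, BB)
example = {
    "snp1": (96, 4, 0),    # MAF about 0.02
    "snp2": (85, 14, 1),   # MAF about 0.08
    "snp3": (50, 40, 10),  # MAF about 0.30
}
for cutoff in (0.01, 0.05, 0.10):
    print(cutoff, snps_passing(example, cutoff))
```

Raising the cutoff shrinks the marker set that feeds LD calculation, haplotype construction and tagSNP selection, which is the mechanism behind the resolution differences the abstract reports.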

12.
Breast cancer exhibits familial aggregation, consistent with variation in genetic susceptibility to the disease. Known susceptibility genes account for less than 25% of the familial risk of breast cancer, and the residual genetic variance is likely to be due to variants conferring more moderate risks. To identify further susceptibility alleles, we conducted a two-stage genome-wide association study in 4,398 breast cancer cases and 4,316 controls, followed by a third stage in which 30 single nucleotide polymorphisms (SNPs) were tested for confirmation in 21,860 cases and 22,578 controls from 22 studies. We used 227,876 SNPs that were estimated to correlate with 77% of known common SNPs in Europeans at r² > 0.5. SNPs in five novel independent loci exhibited strong and consistent evidence of association with breast cancer (P < 10⁻⁷). Four of these contain plausible causative genes (FGFR2, TNRC9, MAP3K1 and LSP1). At the second stage, 1,792 SNPs were significant at the P < 0.05 level compared with an estimated 1,343 that would be expected by chance, indicating that many additional common susceptibility alleles may be identifiable by this approach.

13.
Quantifying the number of deleterious mutations per diploid human genome is of crucial concern to both evolutionary and medical geneticists. Here we combine genome-wide polymorphism data from PCR-based exon resequencing, comparative genomic data across mammalian species, and protein structure predictions to estimate the number of functionally consequential single-nucleotide polymorphisms (SNPs) carried by each of 15 African American (AA) and 20 European American (EA) individuals. We find that AAs show significantly higher levels of nucleotide heterozygosity than do EAs for all categories of functional SNPs considered, including synonymous, non-synonymous, predicted 'benign', predicted 'possibly damaging' and predicted 'probably damaging' SNPs. This result is wholly consistent with previous work showing higher overall levels of nucleotide variation in African populations than in Europeans. EA individuals, in contrast, have significantly more genotypes homozygous for the derived allele at synonymous and non-synonymous SNPs and for the damaging allele at 'probably damaging' SNPs than AAs do. For SNPs segregating only in one population or the other, the proportion of non-synonymous SNPs is significantly higher in the EA sample (55.4%) than in the AA sample (47.0%; P < 2.3 × 10⁻³⁷). We observe a similar proportional excess of SNPs that are inferred to be 'probably damaging' (15.9% in EA; 12.1% in AA; P < 3.3 × 10⁻¹¹). Using extensive simulations, we show that this excess proportion of segregating damaging alleles in Europeans is probably a consequence of a bottleneck that Europeans experienced at about the time of the migration out of Africa.

14.
Understanding the determinants of healthy mental ageing is a priority for society today. So far, we know that intelligence differences show high stability from childhood to old age and there are estimates of the genetic contribution to intelligence at different ages. However, attempts to discover whether genetic causes contribute to differences in cognitive ageing have been relatively uninformative. Here we provide an estimate of the genetic and environmental contributions to stability and change in intelligence across most of the human lifetime. We used genome-wide single nucleotide polymorphism (SNP) data from 1,940 unrelated individuals whose intelligence was measured in childhood (age 11 years) and again in old age (age 65, 70 or 79 years). We use a statistical method that allows genetic (co)variance to be estimated from SNP data on unrelated individuals. We estimate that causal genetic variants in linkage disequilibrium with common SNPs account for 0.24 of the variation in cognitive ability change from childhood to old age. Using bivariate analysis, we estimate a genetic correlation between intelligence at age 11 years and in old age of 0.62. These estimates, derived from rarely available data on lifetime cognitive measures, warrant the search for genetic causes of cognitive stability and change.

15.
The medaka draft genome and insights into vertebrate genome evolution (cited by: 3; self-citations: 0; other citations: 3)
Teleosts comprise more than half of all vertebrate species and have adapted to a variety of marine and freshwater habitats. Their genome evolution and diversification are important subjects for the understanding of vertebrate evolution. Although draft genome sequences of two pufferfishes have been published, analysis of more fish genomes is desirable. Here we report a high-quality draft genome sequence of a small egg-laying freshwater teleost, medaka (Oryzias latipes). Medaka is native to East Asia and an excellent model system for a wide range of biology, including ecotoxicology, carcinogenesis, sex determination and developmental genetics. In the assembled medaka genome (700 megabases), which is less than half of the zebrafish genome, we predicted 20,141 genes, including approximately 2,900 new genes, using 5'-end serial analysis of gene expression tag information. We found single nucleotide polymorphisms (SNPs) at an average rate of 3.42% between the two inbred strains derived from two regional populations; this is the highest SNP rate seen in any vertebrate species. Analyses based on the dense SNP information show a strict genetic separation of 4 million years (Myr) between the two populations, and suggest that differential selective pressures acted on specific gene categories. Four-way comparisons with the human, pufferfish (Tetraodon), zebrafish and medaka genomes revealed that eight major interchromosomal rearrangements took place in a remarkably short period of approximately 50 Myr after the whole-genome duplication event in the teleost ancestor and afterwards, intriguingly, the medaka genome preserved its ancestral karyotype for more than 300 Myr.

16.
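Objective: To investigate the genetic diversity of Cinnamomum japonicum resources in China and the spatial distribution pattern of its populations. Methods: Specific-locus amplified fragment sequencing (SLAF-seq) was performed on 30 C. japonicum accessions from 7 natural populations, and genetic variation was analyzed on the basis of the SNPs detected. Results: The average sequencing depth per sample was 15.11×. A total of 1,296,000 SLAF tags were developed, of which 377,250 were polymorphic among samples and together contained 3,409,402 population SNPs; after filtering, 268,821 highly consistent population SNPs were retained. Phylogenetic analysis based on these SNPs indicated that C. japonicum evolved gradually from eastern China toward western China. Phylogenetic analysis, principal component analysis and population structure analysis all showed that, for the 7 natural populations from 5 provinces (municipalities), variation within populations was smaller than variation between populations. The 30 accessions fell into two major subgroups, the first located on China's second topographic step and the second on the third step, with the two separated by the Greater Khingan-Taihang-Wushan-Xuefeng mountain line. The second subgroup could be further divided into three smaller subgroups: accessions from Zhejiang Province and Taoling, Anhui formed the first, those from Huoshan, Anhui formed the second, and those from Funiu Mountain, Henan formed the third. The first and second smaller subgroups were separated by the Yangtze River. Conclusion: Barriers formed by mountains and lakes may be an important factor in the genetic differentiation of C. japonicum. This study is the first to reveal the genetic structure and geographic variation pattern of C. japonicum, and it provides a theoretical basis for the effective use and scientific conservation of C. japonicum resources in China.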

17.
A map of human genome variation from population-scale sequencing (cited by: 2; self-citations: 0; other citations: 2)
1000 Genomes Project Consortium. Nature, 2010, 467(7319): 1061-1073
The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10⁻⁸ per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.

18.
Keller A, Zhuang H, Chi Q, Vosshall LB, Matsunami H. Nature, 2007, 449(7161): 468-472
Human olfactory perception differs enormously between individuals, with large reported perceptual variations in the intensity and pleasantness of a given odour. For instance, androstenone (5α-androst-16-en-3-one), an odorous steroid derived from testosterone, is variously perceived by different individuals as offensive ("sweaty, urinous"), pleasant ("sweet, floral") or odourless. Similar variation in odour perception has been observed for several other odours. The mechanistic basis of variation in odour perception between individuals is unknown. We investigated whether genetic variation in human odorant receptor genes accounts in part for variation in odour perception between individuals. Here we show that a human odorant receptor, OR7D4, is selectively activated in vitro by androstenone and the related odorous steroid androstadienone (androsta-4,16-dien-3-one) and does not respond to a panel of 64 other odours and two solvents. A common variant of this receptor (OR7D4 WM) contains two non-synonymous single nucleotide polymorphisms (SNPs), resulting in two amino acid substitutions (R88W, T133M; hence 'RT') that severely impair function in vitro. Human subjects with RT/WM or WM/WM genotypes as a group were less sensitive to androstenone and androstadienone and found both odours less unpleasant than the RT/RT group. Genotypic variation in OR7D4 accounts for a significant proportion of the valence (pleasantness or unpleasantness) and intensity variance in perception of these steroidal odours. Our results demonstrate the first link between the function of a human odorant receptor in vitro and odour perception.

19.
Wang J, Wang W, Li R, Li Y, Tian G, Goodman L, Fan W, Zhang J, Li J, Zhang J, Guo Y, Feng B, Li H, Lu Y, Fang X, Liang H, Du Z, Li D, Zhao Y, Hu Y, Yang Z, Zheng H, Hellmann I, Inouye M, Pool J, Yi X, Zhao J, Duan J, Zhou Y, Qin J, Ma L, Li G, Yang Z, Zhang G, Yang B, Yu C, Liang F, Li W, Li S, Li D, Ni P, Ruan J, Li Q, Zhu H, Liu D, Lu Z, Li N, Guo G, Zhang J, Ye J, Fang L, Hao Q, Chen Q, Liang Y, Su Y, San A, Ping C, Yang S, Chen F, Li L, Zhou K, Zheng H, Ren Y, Yang L, Gao Y, Yang G, Li Z, Feng X, Kristiansen K, Wong GK, Nielsen R, Durbin R, Bolund L, Zhang X. Nature, 2008, 456(7218): 60-65
Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics.

20.
A dense map of genetic variation in the laboratory mouse genome will provide insights into the evolutionary history of the species and lead to an improved understanding of the relationship between inter-strain genotypic and phenotypic differences. Here we resequence the genomes of four wild-derived and eleven classical strains. We identify 8.27 million high-quality single nucleotide polymorphisms (SNPs) densely distributed across the genome, and determine the locations of the high (divergent subspecies ancestry) and low (common subspecies ancestry) SNP-rate intervals for every pairwise combination of classical strains. Using these data, we generate a genome-wide haplotype map containing 40,898 segments, each with an average of three distinct ancestral haplotypes. For the haplotypes in the classical strains that are unequivocally assigned ancestry, the genetic contributions of the Mus musculus subspecies (M. m. domesticus, M. m. musculus, M. m. castaneus and the hybrid M. m. molossinus) are 68%, 6%, 3% and 10%, respectively; the remaining 13% of haplotypes are of unknown ancestral origin. The considerable regional redundancy of the SNP data will facilitate imputation of the majority of these genotypes in less-densely typed classical inbred strains to provide a complete view of variation in additional strains.
