Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
We have placed 7,600 cytogenetically defined landmarks on the draft sequence of the human genome to help with the characterization of genes altered by gross chromosomal aberrations that cause human disease. The landmarks are large-insert clones mapped to chromosome bands by fluorescence in situ hybridization. Each clone contains a sequence tag that is positioned on the genomic sequence. This genome-wide set of sequence-anchored clones allows structural and functional analyses of the genome. This resource represents the first comprehensive integration of cytogenetic, radiation hybrid, linkage and sequence maps of the human genome; provides an independent validation of the sequence map and framework for contig order and orientation; surveys the genome for large-scale duplications, which are likely to require special attention during sequence assembly; and allows a stringent assessment of sequence differences between the dark and light bands of chromosomes. It also provides insight into large-scale chromatin structure and the evolution of chromosomes and gene families and will accelerate our understanding of the molecular bases of human disease and cancer.

2.
A physical map of the mouse genome   Total citations: 1, Self-citations: 0, Other citations: 1
A physical map of a genome is an essential guide for navigation, allowing the location of any gene or other landmark in the chromosomal DNA. We have constructed a physical map of the mouse genome that contains 296 contigs of overlapping bacterial clones and 16,992 unique markers. The mouse contigs were aligned to the human genome sequence on the basis of 51,486 homology matches, thus enabling use of the conserved synteny (correspondence between chromosome blocks) of the two genomes to accelerate construction of the mouse map. The map provides a framework for assembly of whole-genome shotgun sequence data, and a tile path of clones for generation of the reference sequence. Definition of the human-mouse alignment at this level of resolution enables identification of a mouse clone that corresponds to almost any position in the human genome. The human sequence may be used to facilitate construction of other mammalian genome maps using the same strategy.

3.
Strategies for assembling large, complex genomes have evolved to include a combination of whole-genome shotgun sequencing and hierarchal map-assisted sequencing. Whole-genome maps of all types can aid genome assemblies, generally starting with low-resolution cytogenetic maps and ending with the highest resolution of sequence. Fingerprint clone maps are based upon complete restriction enzyme digests of clones representative of the target genome, and ultimately comprise a near-contiguous path of clones across the genome. Such clone-based maps are used to validate sequence assembly order, supply long-range linking information for assembled sequences, anchor sequences to the genetic map and provide templates for closing gaps. Fingerprint maps are also a critical resource for subsequent functional genomic studies, because they provide a redundant and ordered sampling of the genome with clones. In an accompanying paper we describe the draft genome sequence of the chicken, Gallus gallus, the first species sequenced that is both a model organism and a global food source. Here we present a clone-based physical map of the chicken genome at 20-fold coverage, containing 260 contigs of overlapping clones. This map represents approximately 91% of the chicken genome and enables identification of chicken clones aligned to positions in other sequenced genomes.

4.
She X  Jiang Z  Clark RA  Liu G  Cheng Z  Tuzun E  Church DM  Sutton G  Halpern AL  Eichler EE 《Nature》2004,431(7011):927-930
Complex eukaryotic genomes are now being sequenced at an accelerated pace primarily using whole-genome shotgun (WGS) sequence assembly approaches. WGS assembly was initially criticized because of its perceived inability to resolve repeat structures within genomes. Here, we quantify the effect of WGS sequence assembly on large, highly similar repeats by comparison of the segmental duplication content of two different human genome assemblies. Our analysis shows that large (> 15 kilobases) and highly identical (> 97%) duplications are not adequately resolved by WGS assembly. This leads to significant reduction in genome length and the loss of genes embedded within duplications. Comparable analyses of mouse genome assemblies confirm that strict WGS sequence assembly will oversimplify our understanding of mammalian genome structure and evolution; a hybrid strategy using a targeted clone-by-clone approach to resolve duplications is proposed.

5.
An SNP map of human chromosome 22   Total citations: 35, Self-citations: 0, Other citations: 35
The human genome sequence will provide a reference for measuring DNA sequence variation in human populations. Sequence variants are responsible for the genetic component of individuality, including complex characteristics such as disease susceptibility and drug response. Most sequence variants are single nucleotide polymorphisms (SNPs), where two alternate bases occur at one position. Comparison of any two genomes reveals around 1 SNP per kilobase. A sufficiently dense map of SNPs would allow the detection of sequence variants responsible for particular characteristics on the basis that they are associated with a specific SNP allele. Here we have evaluated large-scale sequencing approaches to obtaining SNPs, and have constructed a map of 2,730 SNPs on human chromosome 22. Most of the SNPs are within 25 kilobases of a transcribed exon, and are valuable for association studies. We have scaled up the process, detecting over 65,000 SNPs in the genome as part of The SNP Consortium programme, which is on target to build a map of 1 SNP every 5 kilobases that is integrated with the human genome sequence and that is freely available in the public domain.

6.
The sequence of the rice genome holds fundamental information for its biology, including physiology, genetics, development, and evolution, as well as information on many beneficial phenotypes of economic significance. Using a “whole genome shotgun” approach, we have produced a draft rice genome sequence of Oryza sativa ssp. indica, the major crop rice subspecies in China and many other regions of Asia. The draft genome sequence is constructed from over 4.3 million successful sequencing traces with an accumulative total length of 2214.9 Mb. The initial assembly of the non-redundant sequences reached 409.76 Mb in length, based on 3.30 million successful sequencing traces with a total length of 1797.4 Mb from an indica variant cultivar 93-11, giving an estimated coverage of 95.29% of the rice genome with an average base accuracy of higher than 99%. The coverage of the draft sequence, the randomness of the sequence distribution, and the consistency of BIG-ASSEMBLER, a custom-designed software package used for the initial assembly, were verified rigorously by comparisons against finished BAC clone sequences from both indica and japonica strains, available from the public databases. Overall, 96.3% of full-length cDNAs, 96.4% of STS, STR, RFLP markers, 94.0% of ESTs and 94.9% of unigene clusters were identified from the draft sequence. Our preliminary analysis on the data set shows that our rice draft sequence is consistent with the common standard accepted by the genome sequencing community. The unconditional release of the draft to the public also undoubtedly provides a fundamental resource to the international scientific communities to facilitate genomic and genetic studies on rice biology.

7.
The genome sequence and structure of rice chromosome 1   Total citations: 2, Self-citations: 0, Other citations: 2
The rice species Oryza sativa is considered to be a model plant because of its small genome size, extensive genetic map, relative ease of transformation and synteny with other cereal crops. Here we report the essentially complete sequence of chromosome 1, the longest chromosome in the rice genome. We summarize characteristics of the chromosome structure and the biological insight gained from the sequence. The analysis of 43.3 megabases (Mb) of non-overlapping sequence reveals 6,756 protein coding genes, of which 3,161 show homology to proteins of Arabidopsis thaliana, another model plant. About 30% (2,073) of the genes have been functionally categorized. Rice chromosome 1 is (G + C)-rich, especially in its coding regions, and is characterized by several gene families that are dispersed or arranged in tandem repeats. Comparison with a draft sequence indicates the importance of a high-quality finished sequence.

8.
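Research on whole-genome sequencing technology and its application in woody plants   Total citations: 2, Self-citations: 0, Other citations: 2
Genome sequences are an essential information base for genetic research. As sequencing technology has advanced rapidly to third-generation long-read methods, read lengths have grown from a few dozen to tens of thousands of bases, which greatly benefits the completeness and accuracy of genome assemblies. Whole-genome sequencing has now been completed for a large number of plant species, including more than 40 woody plants, and sequencing of many more tree species is under way. Researchers have also developed numerous bioinformatics tools for genome assembly and downstream analysis across the various sequencing technologies. Covering sequencing technology, genome assembly technology, and bioinformatic analysis of whole-genome sequencing data, the authors list the woody plants whose whole genomes have been sequenced to date, describe the development and application of whole-genome sequencing technology, and introduce analysis software suited to assembling genomes from third-generation data, providing a reference for researchers in forest tree genomics.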

9.
Since the sequencing of the first two chromosomes of the malaria parasite, Plasmodium falciparum, there has been a concerted effort to sequence and assemble the entire genome of this organism. Here we report the sequence of chromosomes 1, 3-9 and 13 of P. falciparum clone 3D7; these chromosomes account for approximately 55% of the total genome. We describe the methods used to map, sequence and annotate these chromosomes. By comparing our assemblies with the optical map, we indicate the completeness of the resulting sequence. During annotation, we assign Gene Ontology terms to the predicted gene products, and observe clustering of some malaria-specific terms to specific chromosomes. We identify a highly conserved sequence element found in the intergenic region of internal var genes that is not associated with their telomeric counterparts.

10.
Hassan F  Kamruzzaman M  Mekalanos JJ  Faruque SM 《Nature》2010,467(7318):982-985
Bacterial chromosomes often carry integrated genetic elements (for example plasmids, transposons, prophages and islands) whose precise function and contribution to the evolutionary fitness of the host bacterium are unknown. The CTXφ prophage, which encodes cholera toxin in Vibrio cholerae, is known to be adjacent to a chromosomally integrated element of unknown function termed the toxin-linked cryptic (TLC). Here we report the characterization of a TLC-related element that corresponds to the genome of a satellite filamentous phage (TLC-Knφ1), which uses the morphogenesis genes of another filamentous phage (fs2φ) to form infectious TLC-Knφ1 phage particles. The TLC-Knφ1 phage genome carries a sequence similar to the dif recombination sequence, which functions in chromosome dimer resolution using XerC and XerD recombinases. The dif sequence is also exploited by lysogenic filamentous phages (for example CTXφ) for chromosomal integration of their genomes. Bacterial cells defective in the dimer resolution often show an aberrant filamentous cell morphology. We found that acquisition and chromosomal integration of the TLC-Knφ1 genome restored a perfect dif site and normal morphology to V. cholerae wild-type and mutant strains with dif(-) filamentation phenotypes. Furthermore, lysogeny of a dif(-) non-toxigenic V. cholerae with TLC-Knφ1 promoted its subsequent toxigenic conversion through integration of CTXφ into the restored dif site. These results reveal a remarkable level of cooperative interactions between multiple filamentous phages in the emergence of the bacterial pathogen that causes cholera.

11.
The tomato genome sequence provides insights into fleshy fruit evolution   Total citations: 12, Self-citations: 0, Other citations: 12
Tomato Genome Consortium 《Nature》2012,485(7400):635-641
Tomato (Solanum lycopersicum) is a major crop plant and a model system for fruit development. Solanum is one of the largest angiosperm genera and includes annual and perennial plants from diverse habitats. Here we present a high-quality genome sequence of domesticated tomato, a draft sequence of its closest wild relative, Solanum pimpinellifolium, and compare them to each other and to the potato genome (Solanum tuberosum). The two tomato genomes show only 0.6% nucleotide divergence and signs of recent admixture, but show more than 8% divergence from potato, with nine large and several smaller inversions. In contrast to Arabidopsis, but similar to soybean, tomato and potato small RNAs map predominantly to gene-rich chromosomal regions, including gene promoters. The Solanum lineage has experienced two consecutive genome triplications: one that is ancient and shared with rosids, and a more recent one. These triplications set the stage for the neofunctionalization of genes controlling fruit characteristics, such as colour and fleshiness.

12.
A high-resolution map of active promoters in the human genome   Total citations: 1, Self-citations: 0, Other citations: 1
Kim TH  Barrera LO  Zheng M  Qu C  Singer MA  Richmond TA  Wu Y  Green RD  Ren B 《Nature》2005,436(7052):876-880

13.
A physical map of the human Y chromosome   Total citations: 24, Self-citations: 0, Other citations: 24
The non-recombining region of the human Y chromosome (NRY), which comprises 95% of the chromosome, does not undergo sexual recombination and is present only in males. An understanding of its biological functions has begun to emerge from DNA studies of individuals with partial Y chromosomes, coupled with molecular characterization of genes implicated in gonadal sex reversal, Turner syndrome, graft rejection and spermatogenic failure. But mapping strategies applied successfully elsewhere in the genome have faltered in the NRY, where there is no meiotic recombination map and intrachromosomal repetitive sequences are abundant. Here we report a high-resolution physical map of the euchromatic, centromeric and heterochromatic regions of the NRY and its construction by unusual methods, including genomic clone subtraction and dissection of sequence family variants. Of the map's 758 DNA markers, 136 have multiple locations in the NRY, reflecting its unusually repetitive sequence composition. The markers anchor 1,038 bacterial artificial chromosome clones, 199 of which form a tiling path for sequencing.

14.
Wang J  Wang W  Li R  Li Y  Tian G  Goodman L  Fan W  Zhang J  Li J  Zhang J  Guo Y  Feng B  Li H  Lu Y  Fang X  Liang H  Du Z  Li D  Zhao Y  Hu Y  Yang Z  Zheng H  Hellmann I  Inouye M  Pool J  Yi X  Zhao J  Duan J  Zhou Y  Qin J  Ma L  Li G  Yang Z  Zhang G  Yang B  Yu C  Liang F  Li W  Li S  Li D  Ni P  Ruan J  Li Q  Zhu H  Liu D  Lu Z  Li N  Guo G  Zhang J  Ye J  Fang L  Hao Q  Chen Q  Liang Y  Su Y  San A  Ping C  Yang S  Chen F  Li L  Zhou K  Zheng H  Ren Y  Yang L  Gao Y  Yang G  Li Z  Feng X  Kristiansen K  Wong GK  Nielsen R  Durbin R  Bolund L  Zhang X 《Nature》2008,456(7218):60-65
Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics.

15.
We describe a map of 1.42 million single nucleotide polymorphisms (SNPs) distributed throughout the human genome, providing an average density on available sequence of one SNP every 1.9 kilobases. These SNPs were primarily discovered by two projects: The SNP Consortium and the analysis of clone overlaps by the International Human Genome Sequencing Consortium. The map integrates all publicly available SNPs with described genes and other genomic features. We estimate that 60,000 SNPs fall within exon (coding and untranslated regions), and 85% of exons are within 5 kb of the nearest SNP. Nucleotide diversity varies greatly across the genome, in a manner broadly consistent with a standard population genetic model of human history. This high-density SNP map provides a public resource for defining haplotype variation across the genome, and should help to identify biomedically important genes for diagnosis and therapy.

16.
Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.

17.
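Based on existing sequencing data, three common sequence assembly programs were used to assemble the whole-genome sequencing results of Paenibacillus shenyangensis. The assembly data produced by each program under its own optimal parameters were analyzed and compared, and genes were aligned and predicted against other closely related species of the genus Paenibacillus in the NCBI database. The results show that SOAPdenovo gave the best assembly: at a k-mer of 23, the total assembled genome length and N50 were 5,501,467 bp and 293,864 bp, respectively, and 4,393 of the 4,800 predicted genes were successfully aligned to and annotated against the NCBI-Nr database.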
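The N50 statistic quoted in the abstract above can be illustrated with a short sketch. This is not the authors' code or SOAPdenovo's implementation; it assumes the common definition of N50 as the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The function name `n50` and the example lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Assumed definition: sort contigs from longest to shortest and walk
    down the list until the running total reaches half of the summed
    assembly length; the contig at which that happens gives the N50.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    # Only reached when no contigs were supplied.
    raise ValueError("contig_lengths must not be empty")


# Hypothetical toy assembly: total 290 bp, so half is 145 bp.
# 80 + 70 = 150 >= 145, so the N50 is 70.
example = [80, 70, 50, 40, 30, 20]
print(n50(example))
```

A larger N50 at the same total length means the assembly is spread over fewer, longer contigs, which is why it is commonly reported alongside total assembled length when comparing assemblers.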

18.
Hyman RW  Fung E  Conway A  Kurdi O  Mao J  Miranda M  Nakao B  Rowley D  Tamaki T  Wang F  Davis RW 《Nature》2002,419(6906):534-537
The human malaria parasite Plasmodium falciparum is responsible for the death of more than a million people every year. To stimulate basic research on the disease, and to promote the development of effective drugs and vaccines against the parasite, the complete genome of P. falciparum clone 3D7 has been sequenced, using a chromosome-by-chromosome shotgun strategy. Here we report the nucleotide sequence of the third largest of the parasite's 14 chromosomes, chromosome 12, which comprises about 10% of the 23-megabase genome. As the most (A + T)-rich (80.6%) genome sequenced to date, the P. falciparum genome presented severe problems during the assembly of primary sequence reads. We discuss the methodology that yielded a finished and fully contiguous sequence for chromosome 12. The biological implications of the sequence data are more thoroughly discussed in an accompanying Article (ref. 3).

19.
The International HapMap Project   Total citations: 1, Self-citations: 0, Other citations: 1
The goal of the International HapMap Project is to determine the common patterns of DNA sequence variation in the human genome and to make this information freely available in the public domain. An international consortium is developing a map of these patterns across the genome by determining the genotypes of one million or more sequence variants, their frequencies and the degree of association between them, in DNA samples from populations with ancestry from parts of Africa, Asia and Europe. The HapMap will allow the discovery of sequence variants that affect common disease, will facilitate development of diagnostic tools, and will enhance our ability to choose targets for therapeutic intervention.

20.
We have cloned the replicative form of the Periplaneta fuliginosa densonucleosis virus (PfDNV) genome and determined its complete sequence. The sequence has 5,454 nucleotides (nt); the genome consists of an internal unique sequence flanked by inverted terminal repeats (201 nt). The first 122 nt at the 5′ end and the terminal 122 nt at the 3′ end of both plus and minus strands can fold into a typical hairpin structure. The genome contains seven major open reading frames (ORFs). The plus strand has 4 ORFs occupying the 5′ half of the plus strand, whereas the others span the 5′ half of the minus strand. Two potential promoters were found at map units (m.u.) 3 and 97. Computer analysis of sequence homologies with other parvoviruses suggests that the plus strand of PfDNV very likely encodes the nonstructural proteins and the minus strand probably encodes the structural proteins.
