Similar articles
1.
A complete BAC-based physical map of the Arabidopsis thaliana genome.   Total citations: 11, self-citations: 0, citations by others: 11
Arabidopsis thaliana is a small flowering plant that serves as the major model system in plant molecular genetics. The efforts of many scientists have produced genetic maps that provide extensive coverage of the genome (http://genome-www.stanford.edu/Arabidopsis/maps.html). Recently, detailed YAC, BAC, P1 and cosmid-based physical maps (that is, representations of genomic regions as sets of overlapping clones of corresponding libraries) have been established that extend over wide genomic areas ranging from several hundreds of kilobases to entire chromosomes. These maps provide an entry to gain deeper insight into the A. thaliana genome structure. A. thaliana has been chosen as the subject of the first large-scale project intended to determine the full genome sequence of a plant. This sequencing project, together with the increasing interest in map-based gene cloning, has highlighted the requirement for a complete and accurate physical map of this plant species. To supply the scientific community with a high-quality resource, we present here a complete physical map of A. thaliana using essentially the IGF BAC library. The map consists of 27 contigs that cover the entire genome, except for the presumptive centromeric regions, nucleolar organization regions (NOR) and telomeric areas. This is the first reported map of a complex organism based entirely on BAC clones and it represents the most homogeneous and complete physical map established to date for any plant genome. Furthermore, the analysis performed here serves as a model for an efficient physical mapping procedure using BAC clones that can be applied to other complex genomes.

2.
The genome of the extremophile crucifer Thellungiella parvula   Total citations: 1, self-citations: 0, citations by others: 1
Thellungiella parvula is related to Arabidopsis thaliana and is endemic to saline, resource-poor habitats, making it a model for the evolution of plant adaptation to extreme environments. Here we present the draft genome for this extremophile species. Exclusively by next-generation sequencing, we obtained the de novo assembled genome in 1,496 gap-free contigs, closely approximating the estimated genome size of 140 Mb. We anchored these contigs to seven pseudo chromosomes without the use of maps. We show that short reads can be assembled to a near-complete chromosome level for a eukaryotic species lacking prior genetic information. The sequence identifies a number of tandem duplications that, by the nature of the duplicated genes, suggest a possible basis for T. parvula's extremophile lifestyle. Our results provide essential background for developing genomically influenced testable hypotheses for the evolution of environmental stress tolerance.

3.
The completed draft version of the human genome, comprised of multiple short contigs encompassing 85% or more of euchromatin, was announced in June of 2000 (ref. 1). The detailed findings of the sequencing consortium were reported several months later. The draft sequence has provided insight into global characteristics, such as the total number of genes and a more accurate definition of gene families. Also of importance are genome positional details such as local genome architecture, regional gene density and the location of transcribed units that are critical for disease gene identification. We carried out a series of mapping and computational experiments using a nonredundant collection of 925 expressed sequence tags (ESTs) and sections of the public draft genome sequence that were available at different timepoints between April 2000 and April 2001. We found discrepancies in both the reported coverage of the human genome and the accuracy of mapping of genomic clones, suggesting some limitations of the draft genome sequence in providing accurate positional information and detailed characterization of chromosomal subregions.

4.
We have constructed a BAC framework map of the mouse genome consisting of 2,808 PCR-confirmed BAC clusters, using a previously described method. Fingerprints of BACs from selected clusters confirm the accuracy of the map. Combined with BAC fingerprint data, the framework map covers 37% of the mouse genome.

5.
Large-scale sequencing of cDNAs provides a complementary approach to structural analysis of the human genome by generating expressed sequence tags (ESTs). We have initiated the large-scale sequencing of a 3'-directed cDNA library from the human liver cell line HepG2, which is a non-biased representation of the mRNA population. A total of 982 random cDNA clones were sequenced, yielding more than 270 kilobases. A significant portion of the identified genes encoded secretable proteins and components for protein synthesis. The abundance of cDNA species varied from 2.2% to less than 0.004%. Fifty-two percent of the mRNA were abundant species consisting of 173 genes and the rest were non-abundant, consisting of about 6,600 genes.

6.
The genome of the fission yeast, Schizosaccharomyces pombe, consists of some 14 million base pairs of DNA contained in three chromosomes. On account of its excellent genetics, we used it as a test system for a strategy designed to map mammalian chromosomes and genomes. Data obtained from hybridization fingerprinting established an ordered library of 1,248 yeast artificial chromosome clones with an average size of 535 kilobases. The clones fall into three contigs completely representing the three chromosomes of the organism. This work provides a high-resolution physical and clone map of the genome, which has been related to available genetic and physical map information.

7.
A survey of expressed genes in Caenorhabditis elegans.   Total citations: 29, self-citations: 0, citations by others: 29
As an adjunct to the genomic sequencing of Caenorhabditis elegans, we have investigated a representative cDNA library of 1,517 clones. A single sequence read has been obtained from the 5' end of each clone, allowing its characterization with respect to the public databases, and the clones are being localized on the genome map. The result is the identification of about 1,200 of the estimated 15,000 genes of C. elegans. More than 30% of the inferred protein sequences have significant similarity to existing sequences in the databases, providing a route towards in vivo analysis of known genes in the nematode. These clones also provide material for assessing the accuracy of predicted exons and splicing patterns and will lead to a more accurate estimate of the total number of genes in the organism than has hitherto been available.

8.
We report the 207-Mb genome sequence of the North American Arabidopsis lyrata strain MN47 based on 8.3× dideoxy sequence coverage. We predict 32,670 genes in this outcrossing species compared to the 27,025 genes in the selfing species Arabidopsis thaliana. The much smaller 125-Mb genome of A. thaliana, which diverged from A. lyrata 10 million years ago, likely constitutes the derived state for the family. We found evidence for DNA loss from large-scale rearrangements, but most of the difference in genome size can be attributed to hundreds of thousands of small deletions, mostly in noncoding DNA and transposons. Analysis of deletions and insertions still segregating in A. thaliana indicates that the process of DNA loss is ongoing, suggesting pervasive selection for a smaller genome. The high-quality reference genome sequence for A. lyrata will be an important resource for functional, evolutionary and ecological studies in the genus Arabidopsis.

9.
Genome evolution studies for the phylum Nematoda have been limited by focusing on comparisons involving Caenorhabditis elegans. We report a draft genome sequence of Trichinella spiralis, a food-borne zoonotic parasite, which is the most common cause of human trichinellosis. This parasitic nematode is an extant member of a clade that diverged early in the evolution of the phylum, enabling identification of archetypical genes and molecular signatures exclusive to nematodes. We sequenced the 64-Mb nuclear genome, which is estimated to contain 15,808 protein-coding genes, at ~35-fold coverage using whole-genome shotgun and hierarchal map-assisted sequencing. Comparative genome analyses support intrachromosomal rearrangements across the phylum, disproportionate numbers of protein family deaths over births in parasitic compared to a non-parasitic nematode and a preponderance of gene-loss and -gain events in nematodes relative to Drosophila melanogaster. This genome sequence and the identified pan-phylum characteristics will contribute to genome evolution studies of Nematoda as well as strategies to combat global parasites of humans, food animals and crops.

10.
To test the hypothesis that the human genome project will uncover many genes not previously discovered by sequencing of expressed sequence tags (ESTs), we designed and produced a set of microarrays using probes based on open reading frames (ORFs) in 350 Mb of finished and draft human sequence. Our approach aims to identify all genes directly from genomic sequence by querying gene expression. We analysed genomic sequence with a suite of ORF prediction programs, selected approximately one ORF per gene, amplified the ORFs from genomic DNA and arrayed the amplicons onto treated glass slides. Of the first 10,000 arrayed ORFs, 31% are completely novel and 29% are similar, but not identical, to sequences in public databases. Approximately one-half of these are expressed in the tissues we queried by microarray. Subsequent verification by other techniques confirmed expression of several of the novel genes. Expressed sequence tags (ESTs) have yielded vast amounts of data, but our results indicate that many genes in the human genome will only be found by genomic sequencing.

11.
Telomere-associated chromosome fragmentation (TACF) is a new approach for chromosome mapping based on the non-targeted introduction of cloned telomeres into mammalian cells. TACF has been used to generate a panel of somatic cell hybrids with nested terminal deletions of the long arm of the human X chromosome, extending from Xq26 to the centromere. This panel has been characterized using a series of X chromosome loci. Recovery of the end clones by plasmid rescue produces a telomeric marker for each cell line and partial sequencing will allow the generation of sequence tagged sites (STSs). TACF provides a powerful and widely applicable method for genome analysis, a general way of manipulating mammalian chromosomes and a first step towards constructing artificial mammalian chromosomes.

12.
Detecting genetic variants that are highly divergent from a reference sequence remains a major challenge in genome sequencing. We introduce de novo assembly algorithms using colored de Bruijn graphs for detecting and genotyping simple and complex genetic variants in an individual or population. We provide an efficient software implementation, Cortex, the first de novo assembler capable of assembling multiple eukaryotic genomes simultaneously. Four applications of Cortex are presented. First, we detect and validate both simple and complex structural variations in a high-coverage human genome. Second, we identify more than 3 Mb of sequence absent from the human reference genome, in pooled low-coverage population sequence data from the 1000 Genomes Project. Third, we show how population information from ten chimpanzees enables accurate variant calls without a reference sequence. Last, we estimate classical human leukocyte antigen (HLA) genotypes at HLA-B, the most variable gene in the human genome.

13.
The genome of the mesopolyploid crop species Brassica rapa   Total citations: 21, self-citations: 0, citations by others: 21
We report the annotation and analysis of the draft genome sequence of Brassica rapa accession Chiifu-401-42, a Chinese cabbage. We modeled 41,174 protein coding genes in the B. rapa genome, which has undergone genome triplication. We used Arabidopsis thaliana as an outgroup for investigating the consequences of genome triplication, such as structural and functional evolution. The extent of gene loss (fractionation) among triplicated genome segments varies, with one of the three copies consistently retaining a disproportionately large fraction of the genes expected to have been present in its ancestor. Variation in the number of members of gene families present in the genome may contribute to the remarkable morphological plasticity of Brassica species. The B. rapa genome sequence provides an important resource for studying the evolution of polyploid genomes and underpins the genetic improvement of Brassica oil and vegetable crops.

14.
On the subspecific origin of the laboratory mouse   Total citations: 11, self-citations: 0, citations by others: 11
The genome of the laboratory mouse is thought to be a mosaic of regions with distinct subspecific origins. We have developed a high-resolution map of the origin of the laboratory mouse by generating 25,400 phylogenetic trees at 100-kb intervals spanning the genome. On average, 92% of the genome is of Mus musculus domesticus origin, and the distribution of diversity is markedly nonrandom among the chromosomes. There are large regions of extremely low diversity, which represent blind spots for studies of natural variation and complex traits, and hot spots of diversity. In contrast with the mosaic model, we found that most of the genome has intermediate levels of variation of intrasubspecific origin. Finally, mouse strains derived from the wild that are supposed to represent different mouse subspecies show substantial intersubspecific introgression, which has strong implications for evolutionary studies that assume these are pure representatives of a given subspecies.

15.
16.
Restriction enzyme-generated siRNA (REGS) vectors and libraries   Total citations: 11, self-citations: 0, citations by others: 11
Small interfering RNA (siRNA) technology facilitates the study of loss of gene function in mammalian cells and animal models, but generating multiple siRNA vectors using oligonucleotides is slow, inefficient and costly. Here we describe a new, enzyme-mediated method for generating numerous functional siRNA constructs from any gene of interest or pool of genes. To test our restriction enzyme-generated siRNA (REGS) system, we silenced a transgene and two endogenous genes and obtained the predicted phenotypes. REGS generated on average 34 unique siRNAs per kilobase of sequence. REGS enabled us to create enzymatically a complex siRNA library (>4 × 10^5 clones) from double-stranded cDNA encompassing known and unknown genes with 96% of the clones containing inserts of the appropriate size.

17.
18.
The number of genes in the human genome is unknown, with estimates ranging from 50,000 to 90,000 (refs 1, 2), and to more than 140,000 according to unpublished sources. We have developed 'Exofish', a procedure based on homology searches, to identify human genes quickly and reliably. This method relies on the sequence of another vertebrate, the pufferfish Tetraodon nigroviridis, to detect conserved sequences with a very low background. Similar to Fugu rubripes, a marine pufferfish proposed by Brenner et al. as a model for genomic studies, T. nigroviridis is a more practical alternative with a genome also eight times more compact than that of human. Many comparisons have been made between F. rubripes and human DNA that demonstrate the potential of comparative genomics using the pufferfish genome. Application of Exofish to the December version of the working draft sequence of the human genome and to Unigene showed that the human genome contains 28,000-34,000 genes, and that Unigene contains less than 40% of the protein-coding fraction of the human genome.

19.
A radiation hybrid map of mouse genes   Total citations: 13, self-citations: 0, citations by others: 13
A comprehensive gene-based map of a genome is a powerful tool for genetic studies and is especially useful for the positional cloning and positional candidate approaches. The availability of gene maps for multiple organisms provides the foundation for detailed conserved-orthology maps showing the correspondence between conserved genomic segments. These maps make it possible to use cross-species information in gene hunts and shed light on the evolutionary forces that shape the genome. Here we report a radiation hybrid map of mouse genes, a combined project of the Whitehead Institute/Massachusetts Institute of Technology Center for Genome Research, the Medical Research Council UK Mouse Genome Centre, and the National Center for Biotechnology Information. The map contains 11,109 genes, screened against the T31 RH panel and positioned relative to a reference map containing 2,280 mouse genetic markers. It includes 3,658 genes homologous to the human genome sequence and provides a framework for overlaying the human genome sequence to the mouse and for sequencing the mouse genome.

20.
Opinions on the hypothesis that ancient genome duplications contributed to the vertebrate genome range from strong skepticism to strong credence. Previous studies concentrated on small numbers of gene families or chromosomal regions that might not have been representative of the whole genome, or used subjective methods to identify paralogous genes and regions. Here we report a systematic and objective analysis of the draft human genome sequence to identify paralogous chromosomal regions (paralogons) formed during chordate evolution and to estimate the ages of duplicate genes. We found that the human genome contains many more paralogons than would be expected by chance. Molecular clock analysis of all protein families in humans that have orthologs in the fly and nematode indicated that a burst of gene duplication activity took place in the period 350-650 Myr ago and that many of the duplicate genes formed at this time are located within paralogons. Our results support the contention that many of the gene families in vertebrates were formed or expanded by large-scale DNA duplications in an early chordate. Considering the incompleteness of the sequence data and the antiquity of the event, the results are compatible with at least one round of polyploidy.
