Similar articles
Found 20 similar articles; search took 31 ms
1.
The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.

2.
Gene transfer to the nucleus and the evolution of chloroplasts   Total citations: 61 (self-citations: 0; citations by others: 61)
Photosynthetic eukaryotes, particularly unicellular forms, possess a fossil record that is either fraught with gaps or difficult to interpret, or both. Attempts to reconstruct their evolution have focused on plastid phylogeny, but were limited by the amount and type of phylogenetic information contained within single genes. Among the 210 different protein-coding genes contained in the completely sequenced chloroplast genomes from a glaucocystophyte, a rhodophyte, a diatom, a euglenophyte and five land plants, we have now identified the set of 45 common to each and to a cyanobacterial outgroup genome. Phylogenetic inference with an alignment of 11,039 amino-acid positions per genome indicates that this information is sufficient--but just rarely so--to identify the rooted nine-taxon topology. We mapped the process of gene loss from chloroplast genomes across the inferred tree and found that, surprisingly, independent parallel gene losses in multiple lineages outnumber phylogenetically unique losses by more than 4:1. We identified homologues of 44 different plastid-encoded proteins as functional nuclear genes of chloroplast origin, providing evidence for endosymbiotic gene transfer to the nucleus in plants.

3.
The genome of the model plant Arabidopsis thaliana has been sequenced by an international collaboration, The Arabidopsis Genome Initiative. Here we report the complete sequence of chromosome 5. This chromosome is 26 megabases long; it is the second largest Arabidopsis chromosome and represents 21% of the sequenced regions of the genome. The sequences of chromosomes 2 and 4 have been reported previously and those of chromosomes 1 and 3, together with an analysis of the complete genome sequence, are reported in this issue. Analysis of the sequence of chromosome 5 yields further insights into centromere structure and the sequence determinants of heterochromatin condensation. The 5,874 genes encoded on chromosome 5 reveal several new functions in plants, and the patterns of gene organization provide insights into the mechanisms and extent of genome evolution in plants.

4.
Arabidopsis thaliana is an important model system for plant biologists. In 1996 an international collaboration (the Arabidopsis Genome Initiative) was formed to sequence the whole genome of Arabidopsis and in 1999 the sequence of the first two chromosomes was reported. The sequence of the last three chromosomes and an analysis of the whole genome are reported in this issue. Here we present the sequence of chromosome 3, organized into four sequence segments (contigs). The two largest (13.5 and 9.2 Mb) correspond to the top (long) and the bottom (short) arms of chromosome 3, and the two small contigs are located in the genetically defined centromere. This chromosome encodes 5,220 of the roughly 25,500 predicted protein-coding genes in the genome. About 20% of the predicted proteins have significant homology to proteins in eukaryotic genomes for which the complete sequence is available, pointing to important conserved cellular functions among eukaryotes.

5.
The map-based sequence of the rice genome   Total citations: 14 (self-citations: 0; citations by others: 14)
Rice, one of the world's most important food plants, has important syntenic relationships with the other cereal species and is a model plant for the grasses. Here we present a map-based, finished quality sequence that covers 95% of the 389 Mb genome, including virtually all of the euchromatin and two complete centromeres. A total of 37,544 non-transposable-element-related protein-coding genes were identified, of which 71% had a putative homologue in Arabidopsis. In a reciprocal analysis, 90% of the Arabidopsis proteins had a putative homologue in the predicted rice proteome. Twenty-nine per cent of the 37,544 predicted genes appear in clustered gene families. The number and classes of transposable elements found in the rice genome are consistent with the expansion of syntenic regions in the maize and sorghum genomes. We find evidence for widespread and recurrent gene transfer from the organelles to the nuclear chromosomes. The map-based sequence has proven useful for the identification of genes underlying agronomic traits. The additional single-nucleotide polymorphisms and simple sequence repeats identified in our study should accelerate improvements in rice production.

6.
The genome sequence and structure of rice chromosome 1   Total citations: 2 (self-citations: 0; citations by others: 2)
The rice species Oryza sativa is considered to be a model plant because of its small genome size, extensive genetic map, relative ease of transformation and synteny with other cereal crops. Here we report the essentially complete sequence of chromosome 1, the longest chromosome in the rice genome. We summarize characteristics of the chromosome structure and the biological insight gained from the sequence. The analysis of 43.3 megabases (Mb) of non-overlapping sequence reveals 6,756 protein coding genes, of which 3,161 show homology to proteins of Arabidopsis thaliana, another model plant. About 30% (2,073) of the genes have been functionally categorized. Rice chromosome 1 is (G + C)-rich, especially in its coding regions, and is characterized by several gene families that are dispersed or arranged in tandem repeats. Comparison with a draft sequence indicates the importance of a high-quality finished sequence.

7.
We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended control regions. Some 43% of the genes contain introns, of which there are 4,730. Fifty genes have significant similarity with human disease genes; half of these are cancer related. We identify highly conserved genes important for eukaryotic cell organization including those required for the cytoskeleton, compartmentation, cell-cycle control, proteolysis, protein phosphorylation and RNA splicing. These genes may have originated with the appearance of eukaryotic life. Few similarly conserved genes that are important for multicellular organization were identified, suggesting that the transition from prokaryotes to eukaryotes required more new genes than did the transition from unicellular to multicellular organization.

8.
Sequence and analysis of rice chromosome 4   Total citations: 1 (self-citations: 0; citations by others: 1)
Feng Q  Zhang Y  Hao P  Wang S  Fu G  Huang Y  Li Y  Zhu J  Liu Y  Hu X  Jia P  Zhang Y  Zhao Q  Ying K  Yu S  Tang Y  Weng Q  Zhang L  Lu Y  Mu J  Lu Y  Zhang LS  Yu Z  Fan D  Liu X  Lu T  Li C  Wu Y  Sun T  Lei H  Li T  Hu H  Guan J  Wu M  Zhang R  Zhou B  Chen Z  Chen L  Jin Z  Wang R  Yin H  Cai Z  Ren S  Lv G  Gu W  Zhu G  Tu Y  Jia J  Zhang Y  Chen J  Kang H  Chen X  Shao C  Sun Y  Hu Q  Zhang X  Zhang W  Wang L  Ding C  Sheng H  Gu J  Chen S  Ni L  Zhu F  Chen W  Lan L  Lai Y  Cheng Z  Gu M  Jiang J  Li J  Hong G  Xue Y  Han B. Nature, 2002, 420(6913): 316-320
Rice is the principal food for over half of the population of the world. With its genome size of 430 megabase pairs (Mb), the cultivated rice species Oryza sativa is a model plant for genome research. Here we report the sequence analysis of chromosome 4 of O. sativa, one of the first two rice chromosomes to be sequenced completely. The finished sequence spans 34.6 Mb and represents 97.3% of the chromosome. In addition, we report the longest known sequence for a plant centromere, a completely sequenced contig of 1.16 Mb corresponding to the centromeric region of chromosome 4. We predict 4,658 protein coding genes and 70 transfer RNA genes. A total of 1,681 predicted genes match available unique rice expressed sequence tags. Transposable elements have a pronounced bias towards the euchromatic regions, indicating a close correlation of their distributions to genes along the chromosome. Comparative genome analysis between cultivated rice subspecies shows that there is an overall syntenic relationship between the chromosomes and divergence at the level of single-nucleotide polymorphisms and insertions and deletions. By contrast, there is little conservation in gene order between rice and Arabidopsis.

9.
Bundock P  Hooykaas P. Nature, 2005, 436(7048): 282-284
A significant proportion of the genomes of higher plants and vertebrates consists of transposable elements and their derivatives. Autonomous DNA type transposons encode a transposase that enables them to mobilize to a new chromosomal position in the host genome by a cut-and-paste mechanism. As this is potentially mutagenic, the host limits transposition through epigenetic gene silencing and heterochromatin formation. Here we show that a transposase from Arabidopsis thaliana that we named DAYSLEEPER is essential for normal plant growth; it shares several characteristics with the hAT (hobo, Activator, Tam3) family of transposases. DAYSLEEPER was isolated as a factor binding to a motif (Kubox1) present in the upstream region of the Arabidopsis DNA repair gene Ku70. This motif is also present in the upstream regions of many other plant genes. Plants lacking DAYSLEEPER or strongly overexpressing this gene do not develop in a normal manner. Furthermore, DAYSLEEPER overexpression results in the altered expression of many genes. Our data indicate that transposase-like genes can be essential for plant development and can also regulate global gene expression. Thus, transposases can become domesticated by the host to fulfil important cellular functions.

10.
The genome of the flowering plant Arabidopsis thaliana has five chromosomes. Here we report the sequence of the largest, chromosome 1, in two contigs of around 14.2 and 14.6 megabases. The contigs extend from the telomeres to the centromeric borders, regions rich in transposons, retrotransposons and repetitive elements such as the 180-base-pair repeat. The chromosome represents 25% of the genome and contains about 6,850 open reading frames, 236 transfer RNAs (tRNAs) and 12 small nuclear RNAs. There are two clusters of tRNA genes at different places on the chromosome. One consists of 27 tRNA(Pro) genes and the other contains 27 tandem repeats of tRNA(Tyr)-tRNA(Tyr)-tRNA(Ser) genes. Chromosome 1 contains about 300 gene families with clustered duplications. There are also many repeat elements, representing 8% of the sequence.

11.
Papaya, a fruit crop cultivated in tropical and subtropical regions, is known for its nutritional benefits and medicinal applications. Here we report a 3x draft genome sequence of 'SunUp' papaya, the first commercial virus-resistant transgenic fruit tree to be sequenced. The papaya genome is three times the size of the Arabidopsis genome, but contains fewer genes, including significantly fewer disease-resistance gene analogues. Comparison of the five sequenced genomes suggests a minimal angiosperm gene set of 13,311. A lack of recent genome duplication, atypical of other angiosperm genomes sequenced so far, may account for the smaller papaya gene number in most functional groups. Nonetheless, striking amplifications in gene number within particular functional groups suggest roles in the evolution of tree-like habit, deposition and remobilization of starch reserves, attraction of seed dispersal agents, and adaptation to tropical daylengths. Transgenesis at three locations is closely associated with chloroplast insertions into the nuclear genome, and with topoisomerase I recognition sites. Papaya offers numerous advantages as a system for fruit-tree functional genomics, and this draft genome sequence provides the foundation for revealing the basis of Carica's distinguishing morpho-physiological, medicinal and nutritional properties.

12.
Horizontal gene transfer (HGT) has long been recognized as a principal force in the evolution of genomes. Genome sequences of Archaea and Bacteria have revealed the existence of genes whose similarity to loci in distantly related organisms is explained most parsimoniously by HGT events. In most multicellular organisms, such genetic fixation can occur only in the germ line. Therefore, it is notable that the publication of the human genome reports 113 incidents of direct HGT between bacteria and vertebrates, without any apparent occurrence in evolutionary intermediates, that is, non-vertebrate eukaryotes. Phylogenetic analysis arguably provides the most objective approach for determining the occurrence and directionality of HGT. Here we report a phylogenetic analysis of 28 proposed HGT genes, whose presence in the human genome had been confirmed by polymerase chain reaction (PCR). The results indicate that most putative HGT genes are present in more anciently derived eukaryotes (many such sequences available in non-vertebrate EST databases) and can be explained in terms of descent through common ancestry. They are, therefore, unlikely to be examples of direct HGT from bacteria to vertebrates.

13.
The two-component signaling system has been studied in bacteria. It takes part in signal transduction of adaptive behavior. Recent studies have shown that a similar two-component system is also present in eukaryotes. Examples of this are the ETR1 and CKI1 genes, which may be involved in the signal transduction of the plant hormones ethylene and cytokinin, respectively. The cloning and characterization of a novel gene (NTHK1) fragment from tobacco are presented. Its partial sequence codes for a product which shows similarity to many two-component signaling proteins. Southern blot analysis indicated that there are 2 to 3 copies of the NTHK1 gene in the tobacco genome (allotetraploid). Homologous genes may also exist in other plants such as Arabidopsis, soybean and spinach. The expression of the NTHK1 gene has also been analyzed in tobacco. Further studies on the isolation of full-length cDNA of the NTHK1 gene will elucidate more clearly its function in signal perception and transduction.

14.
Complete genomic sequence is known for two multicellular eukaryotes, the nematode Caenorhabditis elegans and the fruit fly Drosophila melanogaster, and it will soon be known for humans. However, biological function has been assigned to only a small proportion of the predicted genes in any animal. Here we have used RNA-mediated interference (RNAi) to target nearly 90% of predicted genes on C. elegans chromosome I by feeding worms with bacteria that express double-stranded RNA. We have assigned function to 13.9% of the genes analysed, increasing the number of sequenced genes with known phenotypes on chromosome I from 70 to 378. Although most genes with sterile or embryonic lethal RNAi phenotypes are involved in basal cell metabolism, many genes giving post-embryonic phenotypes have conserved sequences but unknown function. In addition, conserved genes are significantly more likely to have an RNAi phenotype than are genes with no conservation. We have constructed a reusable library of bacterial clones that will permit unlimited RNAi screens in the future; this should help develop a more complete view of the relationships between the genome, gene function and the environment.

15.
Recent advances have shown that the majority of the nucleotide variation in the human genome consists of single nucleotide polymorphisms (SNPs). Using SNPs, each chromosome can be divided into different haplotype blocks, and there are limited common haplotypes in each block. This provides a powerful approach for whole-genome scans for disease-associated genes/variants. However, most data available today are based on large-scale genomic analyses; data concerning individual genes for fine mapping with high-density SNPs are relatively lacking. We have sequenced 7 genes and their flanking regions, identified 34 novel SNPs, and constructed high-density SNP haplotypes and haplotype blocks in 5 genes in the centromeric region of chromosome 15 in 100 Chinese Han subjects. Our results show that there is great heterogeneity in the haplotypes and haplotype block structures within and between these genes, which are in close physical proximity. Data obtained in this study provide a useful tool for the candidate gene approach at the fine scale for identifying disease-contributing variants in the genes/regions.

16.
Sequence and analysis of chromosome 2 of Dictyostelium discoideum   Total citations: 1 (self-citations: 0; citations by others: 1)
The genome of the lower eukaryote Dictyostelium discoideum comprises six chromosomes. Here we report the sequence of the largest, chromosome 2, which at 8 megabases (Mb) represents about 25% of the genome. Despite an A + T content of nearly 80%, the chromosome codes for 2,799 predicted protein coding genes and 73 transfer RNA genes. This gene density, about 1 gene per 2.6 kilobases (kb), is surpassed only by Saccharomyces cerevisiae (one per 2 kb) and is similar to that of Schizosaccharomyces pombe (one per 2.5 kb). If we assume that the other chromosomes have a similar gene density, we can expect around 11,000 genes in the D. discoideum genome. A significant number of the genes show higher similarities to genes of vertebrates than to those of other fully sequenced eukaryotes. This analysis strengthens the view that the evolutionary position of D. discoideum is located before the branching of metazoa and fungi but after the divergence of the plant kingdom, placing it close to the base of metazoan evolution.

17.
In eukaryotes, the ubiquitin-mediated protein degradation pathway has been shown to control several key biological processes such as cell division, development, metabolism and immune response. F-box proteins, as part of the SCF (Skp1-Cullin (or Cdc53)-F-box) complex, function by interacting with substrate proteins, leading to their subsequent degradation by the 26S proteasome. To date, several F-box proteins identified in Arabidopsis and Antirrhinum have been shown to play important roles in auxin signal transduction, floral organ formation, flowering and leaf senescence. Arabidopsis genome sequence analysis revealed that it encodes over 1000 predicted F-box proteins, accounting for about 5% of total predicted proteins. These results indicate that ubiquitin-mediated protein degradation involving the F-box proteins is an important mechanism controlling plant gene expression. Here, we review the known F-box proteins and their functions in flowering plants.

18.
Xylella fastidiosa is a fastidious, xylem-limited bacterium that causes a range of economically important plant diseases. Here we report the complete genome sequence of X. fastidiosa clone 9a5c, which causes citrus variegated chlorosis--a serious disease of orange trees. The genome comprises a 52.7% GC-rich 2,679,305-base-pair (bp) circular chromosome and two plasmids of 51,158 bp and 1,285 bp. We can assign putative functions to 47% of the 2,904 predicted coding regions. Efficient metabolic functions are predicted, with sugars as the principal energy and carbon source, supporting existence in the nutrient-poor xylem sap. The mechanisms associated with pathogenicity and virulence involve toxins, antibiotics and ion sequestration systems, as well as bacterium-bacterium and bacterium-host interactions mediated by a range of proteins. Orthologues of some of these proteins have only been identified in animal and human pathogens; their presence in X. fastidiosa indicates that the molecular basis for bacterial pathogenicity is both conserved and independent of host. At least 83 genes are bacteriophage-derived and include virulence-associated genes from other bacteria, providing direct evidence of phage-mediated horizontal gene transfer.

19.
The 1,860,725-base-pair genome of Thermotoga maritima MSB8 contains 1,877 predicted coding regions, 1,014 (54%) of which have functional assignments and 863 (46%) of which are of unknown function. Genome analysis reveals numerous pathways involved in degradation of sugars and plant polysaccharides, and 108 genes that have orthologues only in the genomes of other thermophilic Eubacteria and Archaea. Of the Eubacteria sequenced to date, T. maritima has the highest percentage (24%) of genes that are most similar to archaeal genes. Eighty-one archaeal-like genes are clustered in 15 regions of the T. maritima genome that range in size from 4 to 20 kilobases. Conservation of gene order between T. maritima and Archaea in many of the clustered regions suggests that lateral gene transfer may have occurred between thermophilic Eubacteria and Archaea.

20.