Similar literature
 Found 20 similar documents; search took 46 ms
1.
The use of comparative genomics to infer genome function relies on the understanding of how different components of the genome change over evolutionary time. The aim of such comparative analysis is to identify conserved, functionally transcribed sequences such as protein-coding genes and non-coding RNA genes, and other functional sequences such as regulatory regions, as well as other genomic features. Here, we have compared the entire human chromosome 21 with syntenic regions of the mouse genome, and have identified a large number of conserved blocks of unknown function. Although previous studies have made similar observations, it is unknown whether these conserved sequences are genes or not. Here we present an extensive experimental and computational analysis of human chromosome 21 in an effort to assign function to sequences conserved between human chromosome 21 (ref. 8) and the syntenic mouse regions. Our data support the presence of a large number of potentially functional non-genic sequences, probably regulatory and structural. The integration of the properties of the conserved components of human chromosome 21 to the rapidly accumulating functional data for this chromosome will improve considerably our understanding of the role of sequence conservation in mammalian genomes.  相似文献   

2.
The sequence of the rice genome holds fundamental information for its biology, including physiology, genetics, development, and evolution, as well as information on many beneficial phenotypes of economic significance. Using a “whole genome shotgun” approach, we have produced a draft rice genome sequence of Oryza sativa ssp. indica, the major crop rice subspecies in China and many other regions of Asia. The draft genome sequence is constructed from over 4.3 million successful sequencing traces with an accumulative total length of 2214.9 Mb. The initial assembly of the non-redundant sequences reached 409.76 Mb in length, based on 3.30 million successful sequencing traces with a total length of 1797.4 Mb from an indica variant, cultivar 93-11, giving an estimated coverage of 95.29% of the rice genome with an average base accuracy of higher than 99%. The coverage of the draft sequence, the randomness of the sequence distribution, and the consistency of BIG-ASSEMBLER, a custom-designed software package used for the initial assembly, were verified rigorously by comparisons against finished BAC clone sequences from both indica and japonica strains, available from the public databases. Overall, 96.3% of full-length cDNAs, 96.4% of STS, STR, RFLP markers, 94.0% of ESTs and 94.9% of unigene clusters were identified from the draft sequence. Our preliminary analysis of the data set shows that our rice draft sequence is consistent with the common standard accepted by the genome sequencing community. The unconditional release of the draft to the public also undoubtedly provides a fundamental resource to the international scientific communities to facilitate genomic and genetic studies on rice biology.

3.
L Emorine  M Kuehl  L Weir  P Leder  E E Max 《Nature》1983,304(5925):447-449
Several functionally important genetic elements (such as the TATA box, mRNA splice sequences, poly(A) addition signal) were first detected as short segments of unexplained sequence homology within non-coding regions of different genes. A short region of unknown sequence in the intron between the human J kappa and C kappa immunoglobulin coding regions was found to be sufficiently homologous to the corresponding segment of the mouse gene to form stable heteroduplexes. Although no specific function has yet been definitely ascribed to this region (which we call the kappa intron conserved region, or KICR), some functional significance has been inferred from the findings that (1) activation of B lymphocytes induces a DNase hypersensitivity site in this region and (2) deletions including this region reduce expression of kappa genes introduced into lymphoid cells. To delineate the KICR more precisely and to test the generality of the sequence conservation in a third species, we have sequenced this region of the human and mouse genes and have examined the corresponding region of a recently cloned rabbit kappa gene. We find a segment of about 130 base pairs (bp) that shows striking conservation in all three species, demonstrating homology significantly higher than within the C kappa coding region itself. Two short sequences from the J kappa-C kappa intron that were noted by other investigators to be homologous to proposed 'enhancer' sequences both lie within the conserved region.  相似文献   

4.
Identifying the sequences that direct the spatial and temporal expression of genes and defining their function in vivo remains a significant challenge in the annotation of vertebrate genomes. One major obstacle is the lack of experimentally validated training sets. In this study, we made use of extreme evolutionary sequence conservation as a filter to identify putative gene regulatory elements, and characterized the in vivo enhancer activity of a large group of non-coding elements in the human genome that are conserved in human-pufferfish, Takifugu (Fugu) rubripes, or ultraconserved in human-mouse-rat. We tested 167 of these extremely conserved sequences in a transgenic mouse enhancer assay. Here we report that 45% of these sequences functioned reproducibly as tissue-specific enhancers of gene expression at embryonic day 11.5. While directing expression in a broad range of anatomical structures in the embryo, the majority of the 75 enhancers directed expression to various regions of the developing nervous system. We identified sequence signatures enriched in a subset of these elements that targeted forebrain expression, and used these features to rank all approximately 3,100 non-coding elements in the human genome that are conserved between human and Fugu. The testing of the top predictions in transgenic mice resulted in a threefold enrichment for sequences with forebrain enhancer activity. These data dramatically expand the catalogue of human gene enhancers that have been characterized in vivo, and illustrate the utility of such training sets for a variety of biological applications, including decoding the regulatory vocabulary of the human genome.  相似文献   

5.
Chromosome 13 is the largest acrocentric human chromosome. It carries genes involved in cancer including the breast cancer type 2 (BRCA2) and retinoblastoma (RB1) genes, is frequently rearranged in B-cell chronic lymphocytic leukaemia, and contains the DAOA locus associated with bipolar disorder and schizophrenia. We describe completion and analysis of 95.5 megabases (Mb) of sequence from chromosome 13, which contains 633 genes and 296 pseudogenes. We estimate that more than 95.4% of the protein-coding genes of this chromosome have been identified, on the basis of comparison with other vertebrate genome sequences. Additionally, 105 putative non-coding RNA genes were found. Chromosome 13 has one of the lowest gene densities (6.5 genes per Mb) among human chromosomes, and contains a central region of 38 Mb where the gene density drops to only 3.1 genes per Mb.  相似文献   

6.
In order to understand the genomic changes during the evolution of hexaploid wheat, two sets of synthetic hexaploid wheat from hybridization between maternal tetraploid wheat (AABB) and paternal diploid goat grass (DD) were used for DNA-AFLP and single strand conformation polymorphism (SSCP) analysis to determine the genomic and genic variation in the synthetic hexaploid wheat. Results indicated that more DNA sequences from paternal diploid species were eliminated in the synthetic hexaploid wheat than from maternal tetraploid wheat, suggesting that genome from parental species of lower ploidity tends to be eliminated preferentially. However, sequence variation detected by SSCP procedure was much lower than those detected by DNA-AFLP, which indicated that much less variation in the genic regions occurred in the synthetic hexaploid wheat, and sequence variations detected by DNA-AFLP could be derived mostly from non-coding regions and repetitive sequences. Our results also indicated that sequence variation in 4 genes can be detected in hybrid F1, which suggested that this type of sequence variation could be resulted from distant hybridization. It was interesting to note that 3 out of the 4 genes were mapped and clustered on the long arm of chromosome 2D, which indicated that variation in genic sequences in synthetic hexaploid wheat might not be a randomized process.

7.
In order to understand the genomic changes during evolution of hexaploid wheat, two sets of synthetic hexaploid wheat from hybridization between maternal tetraploid wheat (AABB) and paternal diploid goat grass (DD) were used for DNA-AFLP and single strand conformation polymorphism (SSCP) analysis to determine the genomic and genic variation in the synthetic hexaploid wheat. Results indicated that more DNA sequences from paternal diploid species were eliminated in the synthetic hexaploid wheat than from maternal tetraploid wheat, suggesting that genome from parental species of lower ploidity tends to be eliminated preferentially. However, sequence variation detected by SSCP procedure was much lower than those detected by DNA-AFLP, which indicated that much less variation in the genic regions occurred in the synthetic hexaploid wheat, and sequence variations detected by DNA-AFLP could be derived mostly from non-coding regions and repetitive sequences. Our results also indicated that sequence variation in 4 genes can be detected in hybrid F1, which suggested that this type of sequence variation could be resulted from distant hybridization. It was interesting to note that 3 out of the 4 genes were mapped and clustered on the long arm of chromosome 2D, which indicated that variation in genic sequences in synthetic hexaploid wheat might not be a randomized process.  相似文献   

8.
In order to understand the genomic changes during the evolution of hexaploid wheat, two sets of synthetic hexaploid wheat from hybridization between maternal tetraploid wheat (AABB) and paternal diploid goat grass (DD) were used for DNA-AFLP and single strand conformation polymorphism (SSCP) analysis to determine the genomic and genic variation in the synthetic hexaploid wheat. Results indicated that more DNA sequences from paternal diploid species were eliminated in the synthetic hexaploid wheat than from maternal tetraploid wheat, suggesting that genome from parental species of lower ploidity tends to be eliminated preferentially. However, sequence variation detected by SSCP procedure was much lower than those detected by DNA-AFLP, which indicated that much less variation in the genic regions occurred in the synthetic hexaploid wheat, and sequence variations detected by DNA-AFLP could be derived mostly from non-coding regions and repetitive sequences. Our results also indicated that sequence variation in 4 genes can be detected in hybrid F1, which suggested that this type of sequence variation could be resulted from distant hybridization. It was interesting to note that 3 out of the 4 genes were mapped and clustered on the long arm of chromosome 2D, which indicated that variation in genic sequences in synthetic hexaploid wheat might not be a randomized process.  相似文献   

9.
Evolution of genes and genomes on the Drosophila phylogeny   Cited by: 2 (self-citations: 0, citations by others: 2)
Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the first time (sechellia, simulans, yakuba, erecta, ananassae, persimilis, willistoni, mojavensis, virilis and grimshawi), illustrate how rates and patterns of sequence divergence across taxa can illuminate evolutionary processes on a genomic scale. These genome sequences augment the formidable genetic tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution. Despite remarkable similarities among these Drosophila species, we identified many putatively non-neutral changes in protein-coding genes, non-coding RNA genes, and cis-regulatory regions. These may prove to underlie differences in the ecology and behaviour of these diverse species.  相似文献   

10.
Genome-scale DNA methylation maps of pluripotent and differentiated cells   Cited by: 3 (self-citations: 0, citations by others: 3)
DNA methylation is essential for normal development and has been implicated in many pathologies including cancer. Our knowledge about the genome-wide distribution of DNA methylation, how it changes during cellular differentiation and how it relates to histone methylation and other chromatin modifications in mammals remains limited. Here we report the generation and analysis of genome-scale DNA methylation profiles at nucleotide resolution in mammalian cells. Using high-throughput reduced representation bisulphite sequencing and single-molecule-based sequencing, we generated DNA methylation maps covering most CpG islands, and a representative sampling of conserved non-coding elements, transposons and other genomic features, for mouse embryonic stem cells, embryonic-stem-cell-derived and primary neural cells, and eight other primary tissues. Several key findings emerge from the data. First, DNA methylation patterns are better correlated with histone methylation patterns than with the underlying genome sequence context. Second, methylation of CpGs are dynamic epigenetic marks that undergo extensive changes during cellular differentiation, particularly in regulatory regions outside of core promoters. Third, analysis of embryonic-stem-cell-derived and primary cells reveals that 'weak' CpG islands associated with a specific set of developmentally regulated genes undergo aberrant hypermethylation during extended proliferation in vitro, in a pattern reminiscent of that reported in some primary tumours. More generally, the results establish reduced representation bisulphite sequencing as a powerful technology for epigenetic profiling of cell populations relevant to developmental biology, cancer and regenerative medicine.  相似文献   

11.
The aspergilli comprise a diverse group of filamentous fungi spanning over 200 million years of evolution. Here we report the genome sequence of the model organism Aspergillus nidulans, and a comparative study with Aspergillus fumigatus, a serious human pathogen, and Aspergillus oryzae, used in the production of sake, miso and soy sauce. Our analysis of genome structure provided a quantitative evaluation of forces driving long-term eukaryotic genome evolution. It also led to an experimentally validated model of mating-type locus evolution, suggesting the potential for sexual reproduction in A. fumigatus and A. oryzae. Our analysis of sequence conservation revealed over 5,000 non-coding regions actively conserved across all three species. Within these regions, we identified potential functional elements including a previously uncharacterized TPP riboswitch and motifs suggesting regulation in filamentous fungi by Puf family genes. We further obtained comparative and experimental evidence indicating widespread translational regulation by upstream open reading frames. These results enhance our understanding of these widely studied fungi as well as provide new insight into eukaryotic genome evolution and gene regulation.  相似文献   

12.
Single gene circles in dinoflagellate chloroplast genomes.   Cited by: 25 (self-citations: 0, citations by others: 25)
Z Zhang  B R Green  T Cavalier-Smith 《Nature》1999,400(6740):155-159
Photosynthetic dinoflagellates are important aquatic primary producers and notorious causes of toxic 'red tides'. Typical dinoflagellate chloroplasts differ from all other plastids in having a combination of three envelope membranes and peridinin-chlorophyll a/c light-harvesting pigments. Despite evidence of a dinoflagellate satellite DNA containing chloroplast genes, previous attempts to obtain chloroplast gene sequences have been uniformly unsuccessful. Here we show that the dinoflagellate chloroplast DNA genome structure is unique. Complete sequences of chloroplast ribosomal RNA genes and seven chloroplast protein genes from the dinoflagellate Heterocapsa triquetra reveal that each is located alone on a separate minicircular chromosome: 'one gene-one circle'. The genes are the most divergent known from chloroplast genomes. Each circle has an unusual tripartite non-coding region (putative replicon origin), which is highly conserved among the nine circles through extensive gene conversion, but is very divergent between species. Several other dinoflagellate species have minicircular chloroplast genes, indicating that this type of genomic organization may have evolved in ancestral peridinean dinoflagellates. Phylogenetic analysis indicates that dinoflagellate chloroplasts are related to chromistan and red algal chloroplasts and supports their origin by secondary symbiogenesis.

13.
We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome--composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes--provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.  相似文献   

14.
15.
Sequence and analysis of rice chromosome 4   Cited by: 1 (self-citations: 0, citations by others: 1)
Feng Q  Zhang Y  Hao P  Wang S  Fu G  Huang Y  Li Y  Zhu J  Liu Y  Hu X  Jia P  Zhang Y  Zhao Q  Ying K  Yu S  Tang Y  Weng Q  Zhang L  Lu Y  Mu J  Lu Y  Zhang LS  Yu Z  Fan D  Liu X  Lu T  Li C  Wu Y  Sun T  Lei H  Li T  Hu H  Guan J  Wu M  Zhang R  Zhou B  Chen Z  Chen L  Jin Z  Wang R  Yin H  Cai Z  Ren S  Lv G  Gu W  Zhu G  Tu Y  Jia J  Zhang Y  Chen J  Kang H  Chen X  Shao C  Sun Y  Hu Q  Zhang X  Zhang W  Wang L  Ding C  Sheng H  Gu J  Chen S  Ni L  Zhu F  Chen W  Lan L  Lai Y  Cheng Z  Gu M  Jiang J  Li J  Hong G  Xue Y  Han B 《Nature》2002,420(6913):316-320
Rice is the principal food for over half of the population of the world. With its genome size of 430 megabase pairs (Mb), the cultivated rice species Oryza sativa is a model plant for genome research. Here we report the sequence analysis of chromosome 4 of O. sativa, one of the first two rice chromosomes to be sequenced completely. The finished sequence spans 34.6 Mb and represents 97.3% of the chromosome. In addition, we report the longest known sequence for a plant centromere, a completely sequenced contig of 1.16 Mb corresponding to the centromeric region of chromosome 4. We predict 4,658 protein coding genes and 70 transfer RNA genes. A total of 1,681 predicted genes match available unique rice expressed sequence tags. Transposable elements have a pronounced bias towards the euchromatic regions, indicating a close correlation of their distributions to genes along the chromosome. Comparative genome analysis between cultivated rice subspecies shows that there is an overall syntenic relationship between the chromosomes and divergence at the level of single-nucleotide polymorphisms and insertions and deletions. By contrast, there is little conservation in gene order between rice and Arabidopsis.  相似文献   

16.
The genomic sequence of the attenuated hog cholera virus Lapinized Chinese strain (HCLV) was determined from overlapping cDNA clones. The viral RNA of the HCLV strain comprised 12 310 nucleotides (nt), including 374 nt and 239 nt in the 5′ and 3′ non-coding regions, respectively. The complete genome sequence contained one large open reading frame, which encoded an amino acid sequence of 3 898 residues with a calculated molecular weight of 437×10³. Although there were mostly only small differences between the sequence of the HCLV strain and the published sequences of strains ALD, GPE, Alfort and Brescia, there was one notable insertion of 12 nucleotides, TTTTCTTTTTTC, in the 3′ non-coding region of the HCLV strain. Supported by the National Pandeng Project. GenBank accession number: AF091507. Wang Jiafu: born in 1972, Ph.D.

17.
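Using rabbit skeletal muscle as the experimental material, a rabbit skeletal muscle cDNA library was constructed. Degenerate primers were designed from conserved sequences of the gene, and an EST fragment of the rabbit MSTN gene was cloned by RT-PCR. With the EST fragment as a probe, the library was screened by Southern hybridization, and the full-length cDNA of the rabbit myostatin gene was cloned and deposited in GenBank (accession number: AY169410). Comparative bioinformatic analysis of the gene showed that, in terms of both amino acid sequence and evolution, the rabbit myostatin gene is closely related to those of other mammals.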

18.
By scanning the whole genomic sequence of japonica rice using 45 known plant disease resistance (R) genes, we identified 2119 resistance gene homologs or analogs (RGAs) and verified that RGAs are not randomly distributed but tend to cluster in the rice genome. The RGAs were classified into 21 families according to their functional domain based on a hidden Markov model (HMM). By comparing the RGAs of japonica rice with the whole genomic sequence of indica rice, we found 702 RGAs allelic between the two subspecies and revealed that 671 (95.6%) of them have length differences (InDels) in their genomic sequences (including coding and non-coding regions) between the two subspecies, suggesting that RGAs are highly polymorphic between the two subspecies in rice. We also developed 402 PCR-based and co-dominant candidate RGA markers by designing primer pairs on the regions flanking the InDels and validating them via e-PCR. The length differences of the candidate RGA markers between the two subspecies range from 1 to 742 bp, with an average of 10.26 bp. All related information of the RGAs is available from our web site (http://ibi.zju.edu.cn/RGAs/index.html).

19.
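[Objective] To characterize the gene composition and structural features of chloroplast genomes in Carpinus species, providing a reference for studies of the phylogeny and genome evolution of the genus. [Methods] Chloroplast genomes of 16 Carpinus species were obtained and annotated. Bioinformatic methods were used to compare structural features and the degree of variation among the chloroplast genomes, and the phylogenetic relationships of Carpinus were analyzed with Quercus acutissima as the outgroup. [Results] The chloroplast genomes of all 16 Carpinus species were double-stranded circular molecules, each containing one large single-copy region (LSC), one small single-copy region (SSC), and two inverted repeat regions (IRa and IRb). Genome sizes differed little, with a maximum difference of only 1 902 bp. Gene order was largely consistent and gene numbers were relatively conserved; the number of ribosomal RNA (rRNA) genes was the most conserved, at 8 in every species. In addition, Carpinus chloroplast genomes were relatively conserved in sequence length, gene composition, and GC content, but the four boundaries showed marked diversity. Non-coding regions of the Carpinus chloroplast genomes differed considerably and were highly variable, whereas coding regions differed little and were highly conserved. Among the four parts of the chloroplast genome, the LSC region was the most variable and the IRa region the least variable. The coding regions of psbA, rps16, atpA, rps19, ndhF, ndhI, and ycf1 differed significantly. In addition, the non-coding intergenic spacers ycf3-trnS, trnS-rps4, trnH-psbA, psbZ-trnfM, matK-rps16, rps16-trnQ, trnQ-psbK, ccsA-ndhD, accD-psaI, ndhC-trnV, trnT-trnL, trnF-ndhJ, atpB-rbcL, trnT-psbD, trnE-trnT, trnD-trnY, and rpl32-trnL differed considerably. The coding-region length of the vast majority of genes was highly conserved; length variation in intron-containing protein-coding genes arose mainly from intron length or coding-region length. Phylogenetic analysis divided Carpinus into sect. Carpinus and sect. Distegocarpus. In addition, owing to geographic isolation, European hornbeam (C. betulus) and American hornbeam (C. caroliniana) were more distantly related to the other Carpinus species. [Conclusion] The chloroplast genomes of Carpinus species are highly conserved: gene order is largely consistent and no large-scale inversions or gene rearrangements were detected, but the boundaries between the IR regions and the single-copy (SC) regions show marked diversity. The phylogenetic tree built from chloroplast genomes can, to some extent, reveal the phylogenetic relationships among Carpinus species.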

20.
A physical map of the human Y chromosome   Cited by: 24 (self-citations: 0, citations by others: 24)
The non-recombining region of the human Y chromosome (NRY), which comprises 95% of the chromosome, does not undergo sexual recombination and is present only in males. An understanding of its biological functions has begun to emerge from DNA studies of individuals with partial Y chromosomes, coupled with molecular characterization of genes implicated in gonadal sex reversal, Turner syndrome, graft rejection and spermatogenic failure. But mapping strategies applied successfully elsewhere in the genome have faltered in the NRY, where there is no meiotic recombination map and intrachromosomal repetitive sequences are abundant. Here we report a high-resolution physical map of the euchromatic, centromeric and heterochromatic regions of the NRY and its construction by unusual methods, including genomic clone subtraction and dissection of sequence family variants. Of the map's 758 DNA markers, 136 have multiple locations in the NRY, reflecting its unusually repetitive sequence composition. The markers anchor 1,038 bacterial artificial chromosome clones, 199 of which form a tiling path for sequencing.  相似文献   
