Similar literature
Found 20 similar articles; search took 816 ms
1.
The genome of the mesopolyploid crop species Brassica rapa   Total citations: 21 (self-citations: 0; citations by others: 21)
We report the annotation and analysis of the draft genome sequence of Brassica rapa accession Chiifu-401-42, a Chinese cabbage. We modeled 41,174 protein coding genes in the B. rapa genome, which has undergone genome triplication. We used Arabidopsis thaliana as an outgroup for investigating the consequences of genome triplication, such as structural and functional evolution. The extent of gene loss (fractionation) among triplicated genome segments varies, with one of the three copies consistently retaining a disproportionately large fraction of the genes expected to have been present in its ancestor. Variation in the number of members of gene families present in the genome may contribute to the remarkable morphological plasticity of Brassica species. The B. rapa genome sequence provides an important resource for studying the evolution of polyploid genomes and underpins the genetic improvement of Brassica oil and vegetable crops.

2.
The genome of the extremophile crucifer Thellungiella parvula   Total citations: 1 (self-citations: 0; citations by others: 1)
Thellungiella parvula is related to Arabidopsis thaliana and is endemic to saline, resource-poor habitats, making it a model for the evolution of plant adaptation to extreme environments. Here we present the draft genome for this extremophile species. Exclusively by next-generation sequencing, we obtained the de novo assembled genome in 1,496 gap-free contigs, closely approximating the estimated genome size of 140 Mb. We anchored these contigs to seven pseudochromosomes without the use of maps. We show that short reads can be assembled to a near-complete chromosome level for a eukaryotic species lacking prior genetic information. The sequence identifies a number of tandem duplications that, by the nature of the duplicated genes, suggest a possible basis for T. parvula's extremophile lifestyle. Our results provide essential background for developing genomically influenced testable hypotheses for the evolution of environmental stress tolerance.

3.
A complete BAC-based physical map of the Arabidopsis thaliana genome.   Total citations: 11 (self-citations: 0; citations by others: 11)
Arabidopsis thaliana is a small flowering plant that serves as the major model system in plant molecular genetics. The efforts of many scientists have produced genetic maps that provide extensive coverage of the genome (http://genome-www.stanford.edu/Arabidopsis/maps.html). Recently, detailed YAC, BAC, P1 and cosmid-based physical maps (that is, representations of genomic regions as sets of overlapping clones of corresponding libraries) have been established that extend over wide genomic areas ranging from several hundreds of kilobases to entire chromosomes. These maps provide an entry to gain deeper insight into the A. thaliana genome structure. A. thaliana has been chosen as the subject of the first large-scale project intended to determine the full genome sequence of a plant. This sequencing project, together with the increasing interest in map-based gene cloning, has highlighted the requirement for a complete and accurate physical map of this plant species. To supply the scientific community with a high-quality resource, we present here a complete physical map of A. thaliana using essentially the IGF BAC library. The map consists of 27 contigs that cover the entire genome, except for the presumptive centromeric regions, nucleolar organization regions (NOR) and telomeric areas. This is the first reported map of a complex organism based entirely on BAC clones and it represents the most homogeneous and complete physical map established to date for any plant genome. Furthermore, the analysis performed here serves as a model for an efficient physical mapping procedure using BAC clones that can be applied to other complex genomes.

4.
5.
Genome evolution studies for the phylum Nematoda have been limited by focusing on comparisons involving Caenorhabditis elegans. We report a draft genome sequence of Trichinella spiralis, a food-borne zoonotic parasite, which is the most common cause of human trichinellosis. This parasitic nematode is an extant member of a clade that diverged early in the evolution of the phylum, enabling identification of archetypical genes and molecular signatures exclusive to nematodes. We sequenced the 64-Mb nuclear genome, which is estimated to contain 15,808 protein-coding genes, at ~35-fold coverage using whole-genome shotgun and hierarchical map-assisted sequencing. Comparative genome analyses support intrachromosomal rearrangements across the phylum, disproportionate numbers of protein family deaths over births in parasitic compared to a non-parasitic nematode and a preponderance of gene-loss and -gain events in nematodes relative to Drosophila melanogaster. This genome sequence and the identified pan-phylum characteristics will contribute to genome evolution studies of Nematoda as well as strategies to combat global parasites of humans, food animals and crops.

6.
7.
Arabidopsis thaliana has emerged as a model system for studies of plant genetics and development, and its genome has been targeted for sequencing by an international consortium (the Arabidopsis Genome Initiative; http://genome-www.stanford.edu/Arabidopsis/agi.html). To support the genome-sequencing effort, we fingerprinted more than 20,000 BACs (ref. 2) from two high-quality publicly available libraries, generating an estimated 17-fold redundant coverage of the genome, and used the fingerprints to nucleate assembly of the data by computer. Subsequent manual revision of the assemblies resulted in the incorporation of 19,661 fingerprinted BACs into 169 ordered sets of overlapping clones ('contigs'), each containing at least 3 clones. These contigs are ideal for parallel selection of BACs for large-scale sequencing and have supported the generation of more than 5.8 Mb of finished genome sequence submitted to GenBank; analysis of the sequence has confirmed the integrity of contigs constructed using this fingerprint data. Placement of contigs onto chromosomes can now be performed, and is being pursued by groups involved in both sequencing and positional cloning studies. To our knowledge, these data provide the first example of whole-genome random BAC fingerprint analysis of a eukaryote, and have provided a model essential to efforts aimed at generating similar databases of fingerprint contigs to support sequencing of other complex genomes, including that of human.

8.
We constructed a tiling resolution array consisting of 32,433 overlapping BAC clones covering the entire human genome. This increases our ability to identify genetic alterations and their boundaries throughout the genome in a single comparative genomic hybridization (CGH) experiment. At this tiling resolution, we identified minute DNA alterations not previously reported. These alterations include microamplifications and deletions containing oncogenes, tumor-suppressor genes and new genes that may be associated with multiple tumor types. Our findings show the need to move beyond conventional marker-based genome comparison approaches that rely on inference of continuity between interval markers. Our submegabase resolution tiling set for array CGH (SMRT array) allows comprehensive assessment of genomic integrity and thereby the identification of new genes associated with disease.

9.
10.
The plant Arabidopsis thaliana occurs naturally in many different habitats throughout Eurasia. As a foundation for identifying genetic variation contributing to adaptation to diverse environments, a 1001 Genomes Project to sequence geographically diverse A. thaliana strains has been initiated. Here we present the first phase of this project, based on population-scale sequencing of 80 strains drawn from eight regions throughout the species' native range. We describe the majority of common small-scale polymorphisms as well as many larger insertions and deletions in the A. thaliana pan-genome, their effects on gene function, and the patterns of local and global linkage among these variants. The action of processes other than spontaneous mutation is identified by comparing the spectrum of mutations that have accumulated since A. thaliana diverged from its closest relative 10 million years ago with the spectrum observed in the laboratory. Recent species-wide selective sweeps are rare, and potentially deleterious mutations are more common in marginal populations.

11.
Inversions, deletions and insertions are important mediators of disease and disease susceptibility. We systematically compared the human genome reference sequence with a second genome (represented by fosmid paired-end sequences) to detect intermediate-sized structural variants >8 kb in length. We identified 297 sites of structural variation: 139 insertions, 102 deletions and 56 inversion breakpoints. Using combined literature, sequence and experimental analyses, we validated 112 of the structural variants, including several that are of biomedical relevance. These data provide a fine-scale structural variation map of the human genome and the requisite sequence precision for subsequent genetic studies of human disease.

12.
The Escherichia coli gene recQ was identified as a RecF recombination pathway gene. The gene SGS1, encoding the only RecQ-like DNA helicase in Saccharomyces cerevisiae, was identified by mutations that suppress the top3 slow-growth phenotype. Relatively little is known about the function of Sgs1p because single mutations in SGS1 do not generally cause strong phenotypes. Mutations in genes encoding RecQ-like DNA helicases such as the Bloom and Werner syndrome genes, BLM and WRN, have been suggested to cause increased genome instability. But the exact DNA metabolic defect that might underlie such genome instability has remained unclear. To better understand the cellular role of the RecQ-like DNA helicases, sgs1 mutations were analyzed for their effect on genome rearrangements. Mutations in SGS1 increased the rate of accumulating gross chromosomal rearrangements (GCRs), including translocations and deletions containing extended regions of imperfect homology at their breakpoints. sgs1 mutations also increased the rate of recombination between DNA sequences that had 91% sequence homology. Epistasis analysis showed that Sgs1p is redundant with DNA mismatch repair (MMR) for suppressing GCRs and for suppressing recombination between divergent DNA sequences. This suggests that defects in the suppression of rearrangements involving divergent, repeated sequences may underlie the genome instability seen in BLM and WRN patients and in cancer cases associated with defects in these genes.

13.
The locations and properties of common deletion variants in the human genome are largely unknown. We describe a systematic method for using dense SNP genotype data to discover deletions and its application to data from the International HapMap Consortium to characterize and catalogue segregating deletion variants across the human genome. We identified 541 deletion variants (94% novel) ranging from 1 kb to 745 kb in size; 278 of these variants were observed in multiple, unrelated individuals, 120 in the homozygous state. The coding exons of ten expressed genes were found to be commonly deleted, including multiple genes with roles in sex steroid metabolism, olfaction and drug response. These common deletion polymorphisms typically represent ancestral mutations that are in linkage disequilibrium with nearby SNPs, meaning that their association to disease can often be evaluated in the course of SNP-based whole-genome association studies.

14.
Genome-wide mapping with biallelic markers in Arabidopsis thaliana.   Total citations: 17 (self-citations: 0; citations by others: 17)
Single-nucleotide polymorphisms, as well as small insertions and deletions (here referred to collectively as simple nucleotide polymorphisms, or SNPs), comprise the largest set of sequence variants in most organisms. Positional cloning based on SNPs may accelerate the identification of human disease traits and a range of biologically informative mutations. The recent application of high-density oligonucleotide arrays to allele identification has made it feasible to genotype thousands of biallelic SNPs in a single experiment. It has yet to be established, however, whether SNP detection using oligonucleotide arrays can be used to accelerate the mapping of traits in diploid genomes. The cruciferous weed Arabidopsis thaliana is an attractive model system for the construction and use of biallelic SNP maps. Although important biological processes ranging from fertilization and cell fate determination to disease resistance have been modelled in A. thaliana, identifying mutations in this organism has been impeded by the lack of a high-density genetic map consisting of easily genotyped DNA markers. We report here the construction of a biallelic genetic map in A. thaliana with a resolution of 3.5 cM and its use in mapping Eds16, a gene involved in the defence response to the fungal pathogen Erysiphe orontii. Mapping of this trait involved the high-throughput generation of meiotic maps of F2 individuals using high-density oligonucleotide probe array-based genotyping. We developed a software package called InterMap and used it to automatically delimit Eds16 to a 7-cM interval on chromosome 1. These results are the first demonstration of biallelic mapping in diploid genomes and establish means for generalizing SNP-based maps to virtually any genetic organism.

15.
To test the hypothesis that the human genome project will uncover many genes not previously discovered by sequencing of expressed sequence tags (ESTs), we designed and produced a set of microarrays using probes based on open reading frames (ORFs) in 350 Mb of finished and draft human sequence. Our approach aims to identify all genes directly from genomic sequence by querying gene expression. We analysed genomic sequence with a suite of ORF prediction programs, selected approximately one ORF per gene, amplified the ORFs from genomic DNA and arrayed the amplicons onto treated glass slides. Of the first 10,000 arrayed ORFs, 31% are completely novel and 29% are similar, but not identical, to sequences in public databases. Approximately one-half of these are expressed in the tissues we queried by microarray. Subsequent verification by other techniques confirmed expression of several of the novel genes. Expressed sequence tags (ESTs) have yielded vast amounts of data, but our results indicate that many genes in the human genome will only be found by genomic sequencing.

16.
A high-resolution survey of deletion polymorphism in the human genome   Total citations: 20 (self-citations: 0; citations by others: 20)
Recent work has shown that copy number polymorphism is an important class of genetic variation in human genomes. Here we report a new method that uses SNP genotype data from parent-offspring trios to identify polymorphic deletions. We applied this method to data from the International HapMap Project to produce the first high-resolution population surveys of deletion polymorphism. Approximately 100 of these deletions have been experimentally validated using comparative genome hybridization on tiling-resolution oligonucleotide microarrays. Our analysis identifies a total of 586 distinct regions that harbor deletion polymorphisms in one or more of the families. Notably, we estimate that typical individuals are hemizygous for roughly 30-50 deletions larger than 5 kb, totaling around 550-750 kb of euchromatic sequence across their genomes. The detected deletions span a total of 267 known and predicted genes. Overall, however, the deleted regions are relatively gene-poor, consistent with the action of purifying selection against deletions. Deletion polymorphisms may well have an important role in the genetics of complex traits; however, they are not directly observed in most current gene mapping studies. Our new method will permit the identification of deletion polymorphisms in high-density SNP surveys of trio or other family data.
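The abstract above does not describe how trio genotypes reveal deletions. One plausible idea, stated here as an assumption and not as the authors' method, is that a child who inherits a deletion is hemizygous at the affected SNPs, gets called homozygous, and therefore appears to break Mendelian inheritance at a run of neighbouring markers. The sketch below flags such runs; the function names, the genotype encoding, and the `min_run` threshold are all hypothetical.

```python
from typing import List, Tuple

Genotype = Tuple[str, str]            # two called alleles, e.g. ("A", "G")
Trio = Tuple[Genotype, Genotype, Genotype]  # (father, mother, child) at one SNP


def is_mendelian_consistent(father: Genotype, mother: Genotype, child: Genotype) -> bool:
    """True if the child could have received one allele from each parent."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)


def flag_candidate_deletion_runs(trio_calls: List[Trio], min_run: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) index ranges of consecutive Mendelian-inconsistent SNPs.

    Assumption: SNPs are ordered by genomic position, and a run of at least
    `min_run` inconsistent markers is treated as a candidate deletion.
    """
    runs: List[Tuple[int, int]] = []
    start = None
    for i, (father, mother, child) in enumerate(trio_calls):
        if not is_mendelian_consistent(father, mother, child):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and len(trio_calls) - start >= min_run:
        runs.append((start, len(trio_calls) - 1))
    return runs


if __name__ == "__main__":
    # Hypothetical calls: the father carries a deletion over SNPs 1-2, so he is
    # called A/A and C/C there, and the child is called G/G and T/T.
    calls = [
        (("A", "G"), ("A", "A"), ("A", "G")),
        (("A", "A"), ("G", "G"), ("G", "G")),
        (("C", "C"), ("T", "T"), ("T", "T")),
        (("C", "T"), ("C", "T"), ("C", "C")),
    ]
    print(flag_candidate_deletion_runs(calls))
```

A real analysis would also have to separate deletions from ordinary genotyping errors, which this sketch does not attempt.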

17.
Numerous types of DNA variation exist, ranging from SNPs to larger structural alterations such as copy number variants (CNVs) and inversions. Alignment of DNA sequence from different sources has been used to identify SNPs and intermediate-sized variants (ISVs). However, only a small proportion of total heterogeneity is characterized, and little is known of the characteristics of most smaller-sized (<50 kb) variants. Here we show that genome assembly comparison is a robust approach for identification of all classes of genetic variation. Through comparison of two human assemblies (Celera's R27c compilation and the Build 35 reference sequence), we identified megabases of sequence (in the form of 13,534 putative non-SNP events) that were absent, inverted or polymorphic in one assembly. Database comparison and laboratory experimentation further demonstrated overlap or validation for 240 variable regions and confirmed >1.5 million SNPs. Some differences were simple insertions and deletions, but in regions containing CNVs, segmental duplication and repetitive DNA, they were more complex. Our results uncover substantial undescribed variation in humans, highlighting the need for comprehensive annotation strategies to fully interpret genome scanning and personalized sequencing projects.

18.
We report a recurrent microdeletion syndrome causing mental retardation, epilepsy and variable facial and digital dysmorphisms. We describe nine affected individuals, including six probands: two with de novo deletions, two who inherited the deletion from an affected parent and two with unknown inheritance. The proximal breakpoint of the largest deletion is contiguous with breakpoint 3 (BP3) of the Prader-Willi and Angelman syndrome region, extending 3.95 Mb distally to BP5. A smaller 1.5-Mb deletion has a proximal breakpoint within the larger deletion (BP4) and shares the same distal BP5. This recurrent 1.5-Mb deletion contains six genes, including a candidate gene for epilepsy (CHRNA7) that is probably responsible for the observed seizure phenotype. The BP4-BP5 region undergoes frequent inversion, suggesting a possible link between this inversion polymorphism and recurrent deletion. The frequency of these microdeletions in mental retardation cases is approximately 0.3% (6/2,082 tested), a prevalence comparable to that of Williams, Angelman and Prader-Willi syndromes.
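As a quick check of the arithmetic in the abstract above, the reported frequency of "approximately 0.3%" follows directly from the 6 microdeletions found among 2,082 tested cases. The helper name below is hypothetical and the snippet is only a sketch of that one calculation.

```python
def frequency_percent(cases: int, tested: int) -> float:
    """Return cases / tested expressed as a percentage."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    return 100.0 * cases / tested


if __name__ == "__main__":
    # Counts taken from the abstract: 6 microdeletions in 2,082 tested cases.
    freq = frequency_percent(6, 2082)
    print(f"{freq:.2f}%")
```

The unrounded value is slightly below 0.3%, which is consistent with the abstract's "approximately".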

19.
The approach to annotating a genome critically affects the number and accuracy of genes identified in the genome sequence. Genome annotation based on stringent gene identification is prone to underestimate the complement of genes encoded in a genome. In contrast, over-prediction of putative genes followed by exhaustive computational sequence, motif and structural homology search will find rarely expressed, possibly unique, new genes at the risk of including non-functional genes. We developed a two-stage approach that combines the merits of stringent genome annotation with the benefits of over-prediction. First we identify plausible genes regardless of matches with EST, cDNA or protein sequences from the organism (stage 1). In the second stage, proteins predicted from the plausible genes are compared at the protein level with EST, cDNA and protein sequences, and protein structures from other organisms (stage 2). Remote but biologically meaningful protein sequence or structure homologies provide supporting evidence for genuine genes. The method, applied to the Drosophila melanogaster genome, validated 1,042 novel candidate genes after filtering 19,410 plausible genes, of which 12,124 matched the original 13,601 annotated genes. This annotation strategy is applicable to genomes of all organisms, including human.

20.