Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Substantial efforts are focused on identifying single-nucleotide polymorphisms (SNPs) throughout the human genome, particularly in coding regions (cSNPs), for both linkage disequilibrium and association studies. Less attention, however, has been directed to the clarification of evolutionary processes that are responsible for the variability in nucleotide diversity among different regions of the genome. We report here the population sequence diversity of genomic segments within a 450-kb cluster of olfactory receptor (OR) genes on human chromosome 17. We found a dichotomy in the pattern of nucleotide diversity between OR pseudogenes and introns on the one hand and the closely interspersed intact genes on the other. We suggest that weak positive selection is responsible for the observed patterns of genetic variation. This is inferred from a lower ratio of polymorphism to divergence in genes compared with pseudogenes or introns, high non-synonymous substitution rates in OR genes, and a small but significant overall reduction in variability in the entire OR gene cluster compared with other genomic regions. The dichotomy among functionally different segments within a short genomic distance requires high recombination rates within this OR cluster. Our work demonstrates the impact of weak positive selection on human nucleotide diversity, and has implications for the evolution of the olfactory repertoire.

2.
Analysis of expressed sequence tags indicates 35,000 human genes (total citations: 18; self-citations: 0; citations by others: 18)
Ewing B, Green P. Nature Genetics 2000, 25(2):232-234
The number of protein-coding genes in an organism provides a useful first measure of its molecular complexity. Single-celled prokaryotes and eukaryotes typically have a few thousand genes; for example, Escherichia coli has 4,300 and Saccharomyces cerevisiae has 6,000. Evolution of multicellularity appears to have been accompanied by a several-fold increase in gene number, the invertebrates Caenorhabditis elegans and Drosophila melanogaster having 19,000 and 13,600 genes, respectively. Here we estimate the number of human genes by comparing a set of human expressed sequence tag (EST) contigs with human chromosome 22 and with a non-redundant set of mRNA sequences. The two comparisons give mutually consistent estimates of approximately 35,000 genes, substantially lower than most previous estimates. Evolution of the increased physiological complexity of vertebrates may therefore have depended more on the combinatorial diversification of regulatory networks or alternative splicing than on a substantial increase in gene number.

3.
The number of genes in the human genome is unknown, with estimates ranging from 50,000 to 90,000 (refs 1, 2), and to more than 140,000 according to unpublished sources. We have developed 'Exofish', a procedure based on homology searches, to identify human genes quickly and reliably. This method relies on the sequence of another vertebrate, the pufferfish Tetraodon nigroviridis, to detect conserved sequences with a very low background. Similar to Fugu rubripes, a marine pufferfish proposed by Brenner et al. as a model for genomic studies, T. nigroviridis is a more practical alternative with a genome also eight times more compact than that of human. Many comparisons have been made between F. rubripes and human DNA that demonstrate the potential of comparative genomics using the pufferfish genome. Application of Exofish to the December version of the working draft sequence of the human genome and to Unigene showed that the human genome contains 28,000-34,000 genes, and that Unigene contains less than 40% of the protein-coding fraction of the human genome.

4.
5.
Tandemly repeated DNA sequences are highly dynamic components of genomes. Most repeats are in intergenic regions, but some are in coding sequences or pseudogenes. In humans, expansion of intragenic triplet repeats is associated with various diseases, including Huntington chorea and fragile X syndrome. The persistence of intragenic repeats in genomes suggests that there is a compensating benefit. Here we show that in the genome of Saccharomyces cerevisiae, most genes containing intragenic repeats encode cell-wall proteins. The repeats trigger frequent recombination events in the gene or between the gene and a pseudogene, causing expansion and contraction in the gene size. This size variation creates quantitative alterations in phenotypes (e.g., adhesion, flocculation or biofilm formation). We propose that variation in intragenic repeat number provides the functional diversity of cell surface antigens that, in fungi and other pathogens, allows rapid adaptation to the environment and elusion of the host immune system.

6.
Most human sequence variation is in the form of single-nucleotide polymorphisms (SNPs). It has been proposed that coding-region SNPs (cSNPs) be used for direct association studies to determine the genetic basis of complex traits. The success of such studies depends on the frequency of disease-associated alleles, and their distribution in different ethnic populations. If disease-associated alleles are frequent in most populations, then direct genotyping of candidate variants could show robust associations in manageable study samples. This approach is less feasible if the genetic risk from a given candidate gene is due to many infrequent alleles. Previous studies of several genes demonstrated that most variants are relatively infrequent (<0.05). These surveys genotyped small samples (n<75) and thus had limited ability to identify rare alleles. Here we evaluate the prevalence and distribution of such rare alleles by genotyping an ethnically diverse reference sample that is more than six times larger than those used in previous studies (n=450). We screened for variants in the complete coding sequence and intron-exon junctions of two candidate genes for neuropsychiatric phenotypes: SLC6A4, encoding the serotonin transporter; and SLC18A2, encoding the vesicular monoamine transporter. Both genes have unique roles in neuronal transmission, and variants in either gene might be associated with neurobehavioral phenotypes.

7.
Retroviral insertional mutagenesis in BXH2 and AKXD recombinant inbred mice induces a high incidence of myeloid or B- and T-cell leukaemia, and the proviral integration sites in the leukaemias provide powerful genetic tags for disease gene identification. Some of the disease genes identified by proviral tagging are also associated with human disease, validating this approach for human disease gene identification. Although many leukaemia disease genes have been identified over the years, many more remain to be cloned. Here we describe an inverse PCR (IPCR) method for proviral tagging that makes use of automated DNA sequencing and the genetic tools provided by the Mouse Genome Project, which increases the throughput for disease gene identification. We also use this IPCR method to clone and analyse more than 400 proviral integration sites from AKXD and BXH2 leukaemias and, in the process, identify more than 90 candidate disease genes. Some of these genes function in pathways already implicated in leukaemia, whereas others are likely to define new disease pathways. Our studies underscore the power of the mouse as a tool for gene discovery and functional genomics.

8.
Large-scale sequencing of cDNAs provides a complementary approach to structural analysis of the human genome by generating expressed sequence tags (ESTs). We have initiated the large-scale sequencing of a 3'-directed cDNA library from the human liver cell line HepG2, which is a non-biased representation of the mRNA population. A total of 982 random cDNA clones were sequenced, yielding more than 270 kilobases. A significant portion of the identified genes encoded secretable proteins and components for protein synthesis. The abundance of cDNA species varied from 2.2% to less than 0.004%. Fifty-two percent of the mRNA consisted of abundant species representing 173 genes, and the rest was non-abundant, representing about 6,600 genes.

9.
10.
Rossi JJ. Nature Genetics 2011, 43(4):288-289
MicroRNAs (miRNAs) regulate expression of more than one half of the genes in the human genome. A study now reports a new method for selectively silencing whole families of miRNAs, thus providing a new paradigm for disease therapy.

11.
12.
A major goal in human genetics is to understand the role of common genetic variants in susceptibility to common diseases. This will require characterizing the nature of gene variation in human populations, assembling an extensive catalogue of single-nucleotide polymorphisms (SNPs) in candidate genes and performing association studies for particular diseases. At present, our knowledge of human gene variation remains rudimentary. Here we describe a systematic survey of SNPs in the coding regions of human genes. We identified SNPs in 106 genes relevant to cardiovascular disease, endocrinology and neuropsychiatry by screening an average of 114 independent alleles using 2 independent screening methods. To ensure high accuracy, all reported SNPs were confirmed by DNA sequencing. We identified 560 SNPs, including 392 coding-region SNPs (cSNPs) divided roughly equally between those causing synonymous and non-synonymous changes. We observed different rates of polymorphism among classes of sites within genes (non-coding, degenerate and non-degenerate) as well as between genes. The cSNPs most likely to influence disease, those that alter the amino acid sequence of the encoded protein, are found at a lower rate and with lower allele frequencies than silent substitutions. This likely reflects selection acting against deleterious alleles during human evolution. The lower allele frequency of missense cSNPs has implications for the compilation of a comprehensive catalogue, as well as for the subsequent application to disease association.

13.
Notch and the m9/10 gene (groucho) of the Enhancer of split (E(spl)) complex are members of the "Notch group" of genes, which is required for a variety of cell fate choices in Drosophila. We have characterized human cDNA clones encoding a family of proteins, designated TLE, that are homologous to the E(spl) m9/10 gene product, as well as a novel Notch-related protein. The TLE genes are differentially expressed and encode nuclear proteins, consistent with the presence of sequence motifs associated with nuclear functions. The structural redundancy implied by the existence of more than one TLE and Notch-homologous gene may be a feature of the human counterparts of the developmentally important Drosophila Notch group genes.

14.
Cloning procedures aided by homology searches of EST databases have accelerated the pace of discovery of new genes, but EST database searching remains an involved and onerous task. More than 1.6 million human EST sequences have been deposited in public databases, making it difficult to identify ESTs that represent new genes. Compounding the problems of scale are difficulties in detection associated with a high sequencing error rate and low sequence similarity between distant homologues. We have developed a new method, coupling BLAST-based searches with a domain identification protocol, that filters candidate homologues. Application of this method in a large-scale analysis of 100 signalling domain families has led to the identification of ESTs representing more than 1,000 novel human signalling genes. The 4,206 publicly available ESTs representing these genes are a valuable resource for rapid cloning of novel human signalling proteins. For example, we were able to identify ESTs of at least 106 new small GTPases, of which 6 are likely to belong to new subfamilies. In some cases, further analyses of genomic DNA led to the discovery of previously unidentified full-length protein sequences. This is exemplified by the in silico cloning (prediction of a gene product sequence using only genomic and EST sequence data) of a new type of GTPase with two catalytic domains.

15.
The nature of synthetic genetic interactions involving essential genes (those required for viability) has not been previously examined in a broad and unbiased manner. We crossed yeast strains carrying promoter-replacement alleles for more than half of all essential yeast genes to a panel of 30 different mutants with defects in diverse cellular processes. The resulting genetic network is biased toward interactions between functionally related genes, enabling identification of a previously uncharacterized essential gene (PGA1) required for specific functions of the endoplasmic reticulum. But there are also many interactions between genes with dissimilar functions, suggesting that individual essential genes are required for buffering many cellular processes. The most notable feature of the essential synthetic genetic network is that it has an interaction density five times that of nonessential synthetic genetic networks, indicating that most yeast genetic interactions involve at least one essential gene.

16.
Analysis of the coding genome of diffuse large B-cell lymphoma (total citations: 1; self-citations: 0; citations by others: 1)
Diffuse large B-cell lymphoma (DLBCL) is the most common form of human lymphoma. Although a number of structural alterations have been associated with the pathogenesis of this malignancy, the full spectrum of genetic lesions that are present in the DLBCL genome, and therefore the identity of dysregulated cellular pathways, remains unknown. By combining next-generation sequencing and copy number analysis, we show that the DLBCL coding genome contains, on average, more than 30 clonally represented gene alterations per case. This analysis also revealed mutations in genes not previously implicated in DLBCL pathogenesis, including those regulating chromatin methylation (MLL2; 24% of samples) and immune recognition by T cells. These results provide initial data on the complexity of the DLBCL coding genome and identify novel dysregulated pathways underlying its pathogenesis.

17.
New genes involved in cancer identified by retroviral tagging (total citations: 21; self-citations: 0; citations by others: 21)
Retroviral insertional mutagenesis in BXH2 and AKXD mice induces a high incidence of myeloid leukemia and B- and T-cell lymphoma, respectively. The retroviral integration sites (RISs) in these tumors thus provide powerful genetic tags for the discovery of genes involved in cancer. Here we report the first large-scale use of retroviral tagging for cancer gene discovery in the post-genome era. Using high throughput inverse PCR, we cloned and analyzed the sequences of 884 RISs from a tumor panel composed primarily of B-cell lymphomas. We then compared these sequences, and another 415 RIS sequences previously cloned from BXH2 myeloid leukemias and from a few AKXD lymphomas, against the recently assembled mouse genome sequence. These studies identified 152 loci that are targets of retroviral integration in more than one tumor (common retroviral integration sites, CISs) and therefore likely to encode a cancer gene. Thirty-six CISs encode genes that are known or predicted to be genes involved in human cancer or their homologs, whereas others encode candidate genes that have not yet been examined for a role in human cancer. Our studies demonstrate the power of retroviral tagging for cancer gene discovery in the post-genome era and indicate a largely unrecognized complexity in mouse and presumably human cancer.

18.
Single-nucleotide polymorphisms (SNPs) have been explored as a high-resolution marker set for accelerating the mapping of disease genes. Here we report 48,196 candidate SNPs detected by statistical analysis of human expressed sequence tags (ESTs), associated primarily with coding regions of genes. We used Bayesian inference to weigh evidence for true polymorphism versus sequencing error, misalignment or ambiguity, misclustering or chimaeric EST sequences, assessing data such as raw chromatogram height, sharpness, overlap and spacing, sequencing error rates, context-sensitivity and cDNA library origin. Three separate validations (comparison with 54 genes screened for SNPs independently, verification of HLA-A polymorphisms, and restriction fragment length polymorphism (RFLP) testing) verified 70%, 89% and 71% of our predicted SNPs, respectively. Our method detects tenfold more true HLA-A SNPs than previous analyses of the EST data. We found SNPs in a large fraction of known disease genes, including some disease-causing mutations (for example, the HbS sickle-cell mutation). Our comprehensive analysis of human coding region polymorphism provides a public resource for mapping of disease genes (available at http://www.bioinformatics.ucla.edu/snp).

19.
A transcriptomic analysis of the phylum Nematoda (total citations: 1; self-citations: 0; citations by others: 1)
The phylum Nematoda occupies a huge range of ecological niches, from free-living microbivores to human parasites. We analyzed the genomic biology of the phylum using 265,494 expressed-sequence tag sequences, corresponding to 93,645 putative genes, from 30 species, including 28 parasites. From 35% to 70% of each species' genes had significant similarity to proteins from the model nematode Caenorhabditis elegans. More than half of the putative genes were unique to the phylum, and 23% were unique to the species from which they were derived. We have not yet come close to exhausting the genomic diversity of the phylum. We identified more than 2,600 different known protein domains, some of which had differential abundances between major taxonomic groups of nematodes. We also defined 4,228 nematode-specific protein families from nematode-restricted genes: this class of genes probably underpins species- and higher-level taxonomic disparity. Nematode-specific families are particularly interesting as drug and vaccine targets.

20.