GeneExpress: a computer system for description, analysis, and recognition of regulatory sequences in eukaryotic genome

Proceedings. International Conference on Intelligent Systems for Molecular Biology, 1998

The GeneExpress system has been designed to integrate description, analysis, and recognition of eukaryotic regulatory sequences. The system includes 5 basic units: (1) GeneNet contains an object-oriented database for accumulation of data on gene networks and signal transduction pathways and a Java-based viewer for exploration and visualization of the GeneNet information; (2) Transcription Regulation combines the database on transcription regulatory regions of eukaryotic genes (TRRD) and the TRRD Viewer; (3) Transcription Factor Binding Site Recognition contains a compilation of transcription factor binding sites (TFBSC) and programs for their analysis and recognition; (4) mRNA Translation is designed for analysis of structural and contextual features of mRNA 5'UTRs and prediction of their translation efficiency; and (5) ACTIVITY is the module for analysis and site activity prediction of a given nucleotide sequence. Integration of the databases in GeneExpress is based on the Sequence Retrieval System (SRS) developed at the European Bioinformatics Institute.

Computer Tool FUNSITE for Analysis of Eukaryotic Regulatory Genomic Sequences

1995

We present the computer tool FunSite for description and analysis of regulatory sequences of eukaryotic genomes. The tool consists of two main parts. 1) An integrated database of genomic regulatory sequences, designed on the basis of the TRANSFAC [1] and TRRD [2] databases, which are currently under development. It performs the following functions: i) linkage to the EMBL database; ii) preparation of samples of defined types of functional sites with their flanking sequences; iii) preparation of samples of promoter sequences; iv) preparation of samples of transcription factors classified by the structural and functional features of their DNA-binding and activating domains, functional families, tissue specificity, and other functional features; v) access to data on the relative positions of cis-elements within regulatory regions. 2) A set of programs for analysis of the structural organization of regulatory sequences: i) a program for detecting potential transcription factor binding sites from their consensus sequences; ii) a program for detecting potential binding sites by homology search against nucleotide sequences of real binding sites; iii) a program for analysis of the oligonucleotide context features characteristic of the sequences flanking the binding sites; iv) a program for analysis of correlations between functional site positions; v) a program for designing a recognition method for functional sites based on a generalized weight matrix; vi) a program for detecting potential composite elements. Results of the analysis of promoter sequences of eukaryotic genes with FunSite are also presented.
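The consensus-based site search mentioned in item 2.i can be illustrated with a minimal Python sketch. It is not FunSite's code: it assumes the consensus is written in the standard IUPAC nucleotide code, searches the forward strand only, and uses hypothetical function names (`consensus_to_regex`, `find_sites`) and an illustrative pattern.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[symbol] for symbol in consensus.upper())


def find_sites(sequence, consensus):
    """Return 0-based start positions of all matches on the forward strand.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Illustrative pattern and sequence (both made up for this example).
positions = find_sites("GGTATAAAAGGTATATAT", "TATAWA")
```

A real tool would also scan the reverse complement and usually tolerate a limited number of mismatches; both are left out here to keep the sketch short.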

Integrated databases and computer systems for studying eukaryotic gene expression

Bioinformatics/Computer Applications in the Biosciences, 1999

The goal of the work was to develop a WWW-oriented computer system that provides maximal integration of, and navigation through, informational and software resources on the regulation of gene expression. The rapid growth in the variety and volume of information accumulated in databases on regulation of gene expression requires computer systems for automated discovery of knowledge that can then be used to analyze regulatory genomic sequences. Results: The GeneExpress system developed includes the following major informational and software systems [...] for detecting conserved contextual regions of functional sites and their recognition; (4) Gene Networks (GeneNet), which contains an object-oriented database accumulating the data on gene networks and signal transduction pathways, and the Java-based Viewer for exploration and visualization of the GeneNet information; (5) mRNA Translation (Leader mRNA), designed to analyze structural and contextual properties of mRNA 5′-untranslated regions (5′-UTRs) and predict their translation efficiency; (6) other program modules designed to study the structure-function organization of regulatory genomic sequences and regulatory proteins. Availability: GeneExpress is available at http://wwwmgs.bionet.nsc.ru/systems/GeneExpress/ and the links to the mirror site(s) can be found at http://wwwmgs.bionet.nsc.ru/mgs/links/mirrors.html. Contact: kol@bionet.nsc.ru. Bioinformatics, Vol. 15, nos 7/8, 1999, pages 669-686. © Oxford University Press 1999. N.A. Kolchanov et al.

TRRD and COMPEL databases on transcription linked to TRANSFAC as tools for analysis and recognition of regulatory sequences

Lecture Notes in Computer Science, 1996

Two new databases have been developed to support comprehensive research on the mechanisms controlling eukaryotic gene expression at the transcriptional level: TRRD (Transcription Regulatory Region Database), for accumulation of data on the structural and functional organisation of gene regulatory regions, and COMPEL, a database on composite regulatory elements that contain contiguous or overlapping binding sites for different transcription factors. TRRD, COMPEL and TRANSFAC have been linked through the common table "GENES". Computer analysis of the transcription regulatory sequences collected in the databases has been carried out with the SITEVIDEO system. SITEVIDEO offers the following programs: search for significant oligonucleotides in a 15-letter code; dinucleotide weight consensus; analysis of DNA conformation parameters.

Transcription Regulatory Regions Database (TRRD): its status in 1999

Nucleic Acids Research, 1999

The Transcription Regulatory Regions Database (TRRD) is a curated database designed for accumulation of experimental data on extended regulatory regions of eukaryotic genes, the regulatory elements they contain, i.e., transcription factor binding sites, promoters, enhancers, silencers, etc., and expression patterns of the genes. Release 4.1 of TRRD offers a number of significant improvements, in particular, a more detailed description of transcription factor binding sites, transcription factors per se, and gene expression patterns in a computer-readable format. In addition, the new TRRD release provides considerably more references to other molecular biological databases. TRRD 4.1 is installed under SRS and is available through the WWW at http://www.bionet.nsc.ru/trrd/

TRANSFAC®: transcriptional regulation, from patterns to profiles

Nucleic acids …, 2003

The TRANSFAC® database on eukaryotic transcriptional regulation, comprising data on transcription factors, their target genes and regulatory binding sites, has been extended and further developed, both in number of entries and in the scope and structure of the collected data. Structured fields for expression patterns have been introduced for transcription factors from human and mouse, using the CYTOMER® database on anatomical structures and developmental stages. The functionality of Match™, a tool for matrix-based search of transcription factor binding sites, has been enhanced. For instance, the program now comes along with a number of tissue- (or state-) specific profiles and new profiles can be created and modified with Match™ Profiler. The GENE table was extended and gained in importance, containing amongst others links to LocusLink, RefSeq and OMIM now. Further, (direct) links between factor and target gene on one hand and between gene and encoded factor on the other hand were introduced. The TRANSFAC® public release is available at http://www.gene-regulation.com. For yeast an additional release including the latest data was made available separately as TRANSFAC® Saccharomyces Module (TSM) at http://transfac.gbf.de. For CYTOMER® free download versions are available at http://www.biobase.de:8080/index.html.
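To make "matrix-based search" concrete, here is a minimal Python sketch of a generic position weight matrix scan. It is an assumption-laden illustration, not the scoring scheme used by Match: the log-odds formula, the uniform background of 0.25, the pseudocount, and the names `log_odds_matrix` and `scan` are all choices made for this example.

```python
import math


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores.

    `counts` is a list with one dict per matrix position, e.g. {"A": 8}.
    Bases missing from a dict are treated as having a count of zero.
    """
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2((column.get(base, 0) + pseudocount) / total / background)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, score) for each window scoring at or above threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width].upper()
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous symbols
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy three-position matrix built from made-up counts.
toy_matrix = log_odds_matrix([{"A": 8}, {"C": 8}, {"G": 8}])
toy_hits = scan("TTACGTT", toy_matrix, threshold=3.0)
```

In this framing, a profile of the kind the abstract describes would amount to a chosen set of matrices with per-matrix thresholds; how Match defines and stores profiles is not stated here.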

TranScout: prediction of gene expression regulatory proteins from their sequences

Bioinformatics/Computer Applications in the Biosciences, 2002

The advent of genomics yields thousands of reading frames in search of function. Identification of conserved functional motifs in protein sequences can be helpful for function prediction. Results: A database and a classification of reported DNA-binding protein motifs have been designed. A program ('TranScout') has been developed for the detection and evaluation of conserved motifs in prokaryotic and eukaryotic sequences of proteins with a gene regulatory function. The efficiency of the program is shown in a benchmark against a database obtained from SWISS-PROT without the protein sequences used to train the program. All motifs were detected with a mean average sensitivity of 0.98 and a mean average specificity of 0.92. Availability: The program is freely available for use on the internet at http://luz.uab.es/transcout/. The user can find additional information at this site.
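For readers unfamiliar with the benchmark figures, the following Python sketch shows how sensitivity and specificity are commonly computed from a labelled test set. The abstract does not state which definition of specificity was used: the standard one is TN/(TN+FP), but some motif-detection papers report TP/(TP+FP) (precision) under that name, so both are given. All function names and the toy labels are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives from parallel boolean lists."""
    tp = fp = tn = fn = 0
    for is_positive, called_positive in zip(actual, predicted):
        if is_positive and called_positive:
            tp += 1
        elif is_positive:
            fn += 1
        elif called_positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Fraction of real positives that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of real negatives that were correctly rejected."""
    return tn / (tn + fp)


def precision(tp, fp):
    """Fraction of positive calls that are real (sometimes called specificity)."""
    return tp / (tp + fp)


# Toy labels: does each protein really carry the motif, and was it detected?
actual = [True, True, True, False, False]
predicted = [True, True, False, True, False]
tp, fp, tn, fn = confusion_counts(actual, predicted)
```

A "mean average" value, as reported in the abstract, would presumably be obtained by computing such a measure per motif and averaging over motifs; that reading is an assumption.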

MoD Tools: regulatory motif discovery in nucleotide sequences from co-regulated or homologous genes

Nucleic Acids Research, 2006

Understanding the complex mechanisms regulating gene expression at the transcriptional and posttranscriptional levels is one of the greatest challenges of the post-genomic era. The MoD (MOtif Discovery) Tools web server comprises a set of tools for the discovery of novel conserved sequence and structure motifs in nucleotide sequences, motifs that in turn are good candidates for regulatory activity. The server includes the following programs: Weeder, for the discovery of conserved transcription factor binding sites (TFBSs) in nucleotide sequences from co-regulated genes; WeederH, for the discovery of conserved TFBSs and distal regulatory modules in sequences from homologous genes; RNAProfile, for the discovery of conserved secondary structure motifs in unaligned RNA sequences whose secondary structure is not known. In this way, a given gene can be compared with other co-regulated genes or with its homologs, or its mRNA can be analyzed for conserved motifs regulating its post-transcriptional fate. The web server thus provides researchers with different strategies and methods to investigate the regulation of gene expression, at both the transcriptional and post-transcriptional levels. Available at

The PAZAR database of gene regulatory information coupled to the ORCA toolkit for the study of regulatory sequences

Nucleic Acids Research, 2009

The PAZAR database unites independently created and maintained data collections of transcription factor and regulatory sequence annotation. The flexible PAZAR schema permits the representation of diverse information derived from experiments ranging from biochemical protein-DNA binding to cellular reporter gene assays. Data collections can be made available to the public, or restricted to specific system users. The data 'boutiques' within the shopping-mall-inspired system facilitate the analysis of genomics data and the creation of predictive models of gene regulation. Since its initial release, PAZAR has grown in terms of data, features and through the addition of an associated package of software tools called the ORCA toolkit (ORCAtk). ORCAtk allows users to rapidly develop analyses based on the information stored in the PAZAR system. PAZAR is available at http://www.pazar.info. ORCAtk can be accessed through convenient buttons located in the PAZAR pages or via our website at