Identification of regulatory elements using a feature selection method
Sündüz Keleş et al. Bioinformatics. 2002 Sep.
Abstract
Motivation: Many methods have been described to identify regulatory motifs in the transcription control regions of genes that exhibit similar patterns of gene expression across a variety of experimental conditions. Here we focus on a single experimental condition, and utilize gene expression data to identify sequence motifs associated with genes that are activated under this experimental condition. We use a linear model with two-way interactions to model gene expression as a function of sequence features (words) present in presumptive transcription control regions. The most relevant features are selected by a feature selection method called stepwise selection with Monte Carlo cross-validation. We apply this method to a publicly available dataset of the yeast Saccharomyces cerevisiae, focussing on the 800 base pairs immediately upstream of each gene's translation start site (the upstream control region, UCR).
Results: We successfully identify regulatory motifs that are known to be active under the experimental conditions analyzed, and find additional significant sequences that may represent novel regulatory motifs. We also discuss a complementary method that utilizes gene expression data from a single microarray experiment and allows averaging over a variety of experimental conditions as an alternative to motif finding methods that act on clusters of co-expressed genes.
Availability: The software is available upon request from the first author or may be downloaded from http://www.stat.berkeley.edu/~sunduz.
Contact: keles@stat.berkeley.edu
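The modelling idea summarized under Motivation (expression regressed on word counts and their two-way interactions, with terms picked stepwise and scored by Monte Carlo cross-validation) can be illustrated with a small sketch. This is not the authors' software: the function names (`word_counts`, `add_interactions`, `make_splits`, `cv_error`, `forward_select`), the greedy forward-only search, the stopping rule, and all parameter defaults are assumptions made for illustration, and the demo uses simulated sequences rather than yeast upstream regions.

```python
import itertools
import numpy as np


def word_counts(seqs, k=2):
    """Count overlapping occurrences of every DNA k-mer in each sequence."""
    words = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    col = {w: j for j, w in enumerate(words)}
    X = np.zeros((len(seqs), len(words)))
    for i, s in enumerate(seqs):
        for t in range(len(s) - k + 1):
            j = col.get(s[t:t + k])
            if j is not None:  # skip windows containing non-ACGT characters
                X[i, j] += 1
    return X, words


def add_interactions(X, names):
    """Append the product of every pair of columns (two-way interactions)."""
    cols, labels = [X], list(names)
    for i, j in itertools.combinations(range(X.shape[1]), 2):
        cols.append((X[:, i] * X[:, j])[:, None])
        labels.append(f"{names[i]}:{names[j]}")
    return np.hstack(cols), labels


def make_splits(n, n_splits=10, test_frac=0.5, seed=0):
    """Monte Carlo cross-validation: independent random (test, train) splits."""
    rng = np.random.default_rng(seed)
    n_test = int(round(test_frac * n))
    return [np.split(rng.permutation(n), [n_test]) for _ in range(n_splits)]


def cv_error(X, y, cols, splits):
    """Mean squared test error of least squares on intercept + chosen columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    errs = []
    for test, train in splits:
        beta, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        errs.append(np.mean((y[test] - A[test] @ beta) ** 2))
    return float(np.mean(errs))


def forward_select(X, y, splits, max_terms=3):
    """Greedy forward selection; stop when cross-validated error stops improving."""
    chosen, best = [], cv_error(X, y, [], splits)
    while len(chosen) < max_terms:
        candidates = [c for c in range(X.shape[1]) if c not in chosen]
        if not candidates:
            break
        scores = {c: cv_error(X, y, chosen + [c], splits) for c in candidates}
        c_best = min(scores, key=scores.get)
        if scores[c_best] >= best:
            break
        chosen.append(c_best)
        best = scores[c_best]
    return chosen, best


if __name__ == "__main__":
    # Simulated data (an assumption): expression depends only on the word "AC".
    rng = np.random.default_rng(0)
    seqs = ["".join(rng.choice(list("ACGT"), size=40)) for _ in range(120)]
    X, words = word_counts(seqs, k=2)
    y = 2.0 * X[:, words.index("AC")] + rng.normal(0, 0.1, size=len(seqs))
    Z, labels = add_interactions(X, words)
    chosen, err = forward_select(Z, y, make_splits(len(y)), max_terms=3)
    print([labels[c] for c in chosen], err)
```

Reusing the same random splits for every candidate model keeps the comparison between candidates paired, which is a design choice of this sketch rather than something stated in the abstract.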