FIMO: scanning for occurrences of a given motif
Charles E Grant et al. Bioinformatics. 2011.
Abstract
Summary: A motif is a short DNA or protein sequence that contributes to the biological function of the sequence in which it resides. Over the past several decades, many computational methods have been described for identifying, characterizing and searching with sequence motifs. Critical to nearly any motif-based sequence analysis pipeline is the ability to scan a sequence database for occurrences of a given motif described by a position-specific frequency matrix.
Results: We describe Find Individual Motif Occurrences (FIMO), a software tool for scanning DNA or protein sequences with motifs described as position-specific scoring matrices. The program computes a log-likelihood ratio score for each position in a given sequence database, uses established dynamic programming methods to convert this score to a P-value and then applies false discovery rate analysis to estimate a q-value for each position in the given sequence. FIMO provides output in a variety of formats, including HTML, XML and several Santa Cruz Genome Browser formats. The program is efficient, allowing for the scanning of DNA sequences at a rate of 3.5 Mb/s on a single CPU.
Availability and implementation: FIMO is part of the MEME Suite software toolkit. A web server and source code are available at http://meme.sdsc.edu.
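The scoring pipeline outlined in the abstract (log-likelihood ratio score per position, dynamic programming to obtain a P-value, then a false discovery rate estimate) can be illustrated with a minimal Python sketch. This is not FIMO's source code: the toy motif, the uniform background, the pseudocount, the integer score scaling, and every function name here are assumptions made for illustration, and the q-value step uses a Benjamini-Hochberg-style estimate that may differ from the procedure FIMO actually applies.

```python
import math
from collections import defaultdict

# Assumption: uniform background; real scans usually estimate this from the data.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

def log_odds_matrix(freqs, background=BACKGROUND, pseudocount=0.1, scale=10):
    """Turn per-position base frequencies into integer-scaled log2-odds scores."""
    pssm = []
    for column in freqs:
        total = sum(column.values()) + pseudocount
        row = {}
        for base, bg in background.items():
            p = (column.get(base, 0.0) + pseudocount * bg) / total
            row[base] = round(scale * math.log2(p / bg))
        pssm.append(row)
    return pssm

def score_distribution(pssm, background=BACKGROUND):
    """Dynamic programming: exact null distribution of the summed integer score."""
    dist = {0: 1.0}
    for row in pssm:
        nxt = defaultdict(float)
        for score, prob in dist.items():
            for base, bg in background.items():
                nxt[score + row[base]] += prob * bg
        dist = dict(nxt)
    return dist

def pvalue_table(dist):
    """Map each achievable score s to P(S >= s) under the background model."""
    table, tail = {}, 0.0
    for score in sorted(dist, reverse=True):
        tail += dist[score]
        table[score] = min(tail, 1.0)
    return table

def scan(sequence, pssm, pvalues):
    """Return (start, score, p-value) for every full-width window."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BACKGROUND for base in window):
            continue  # skip windows containing ambiguous characters
        score = sum(row[base] for row, base in zip(pssm, window))
        hits.append((start, score, pvalues[score]))
    return hits

def bh_qvalues(pvals):
    """Benjamini-Hochberg-style q-values (an assumed stand-in for FIMO's method)."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    qvals, running = [0.0] * n, 1.0
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running = min(running, pvals[i] * n / rank)
        qvals[i] = running
    return qvals

# Hypothetical 4-column motif with consensus TGCA.
consensus = "TGCA"
toy_freqs = [{b: (0.85 if b == c else 0.05) for b in "ACGT"} for c in consensus]

pssm = log_odds_matrix(toy_freqs)
null_dist = score_distribution(pssm)
pvalues = pvalue_table(null_dist)
hits = scan("AAAATGCAAAAA", pssm, pvalues)
qvals = bh_qvalues([p for _, _, p in hits])
best = max(hits, key=lambda h: h[1])
print(best, qvals[hits.index(best)])
```

Rounding the scores to integers keeps the set of achievable totals small, which is what makes the exact null distribution cheap to compute. The sketch scans one strand only.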
Figures
Fig. 1.
Using FIMO to identify candidate CTCF binding sites in the human genome. (A) Sample FIMO HTML output, showing the locations of the top-scoring occurrences of the CTCF motif in the human genome. (B) A precision-recall curve created by comparing FIMO's ranked list of CTCF sites with a gold standard derived from a ChIP-seq experiment.
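The evaluation in panel B can be sketched in Python under a simplifying assumption: each predicted site and each gold-standard site is reduced to a comparable identifier, so that a prediction counts as correct when its identifier appears in the gold set. How FIMO hits were actually matched to ChIP-seq peaks is not stated here, and the function name and toy identifiers below are hypothetical.

```python
def precision_recall(ranked_sites, gold_standard):
    """Return (recall, precision) after each prefix of a ranked prediction list."""
    gold = set(gold_standard)
    if not gold:
        raise ValueError("gold standard must contain at least one site")
    points = []
    true_positives = 0
    for rank, site in enumerate(ranked_sites, start=1):
        if site in gold:
            true_positives += 1
        points.append((true_positives / len(gold), true_positives / rank))
    return points

# Toy data: four predictions ranked by score, three gold-standard sites.
ranked = ["s1", "s2", "s3", "s4"]
gold = {"s1", "s3", "s5"}
curve = precision_recall(ranked, gold)
for recall, precision in curve:
    print(f"recall={recall:.2f} precision={precision:.2f}")
```

Each point corresponds to cutting the ranked list at one more prediction, so recall never decreases along the curve while precision can move in either direction.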
Similar articles
- MEME SUITE: tools for motif discovery and searching.
Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, Ren J, Li WW, Noble WS. Nucleic Acids Res. 2009 Jul;37(Web Server issue):W202-8. doi: 10.1093/nar/gkp335. Epub 2009 May 20. PMID: 19458158 Free PMC article.
- MEME: discovering and analyzing DNA and protein sequence motifs.
Bailey TL, Williams N, Misleh C, Li WW. Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W369-73. doi: 10.1093/nar/gkl198. PMID: 16845028 Free PMC article.
- rMotifGen: random motif generator for DNA and protein sequences.
Rouchka EC, Hardin CT. BMC Bioinformatics. 2007 Aug 7;8:292. doi: 10.1186/1471-2105-8-292. PMID: 17683637 Free PMC article.
- Epigenetic priors for identifying active transcription factor binding sites.
Cuellar-Partida G, Buske FA, McLeay RC, Whitington T, Noble WS, Bailey TL. Bioinformatics. 2012 Jan 1;28(1):56-62. doi: 10.1093/bioinformatics/btr614. Epub 2011 Nov 8. PMID: 22072382 Free PMC article.
- Discovering sequence motifs.
Bailey TL. Methods Mol Biol. 2008;452:231-51. doi: 10.1007/978-1-60327-159-2_12. PMID: 18566768 Review.
Cited by
- Mapping the human miRNA interactome by CLASH reveals frequent noncanonical binding.
Helwak A, Kudla G, Dudnakova T, Tollervey D. Cell. 2013 Apr 25;153(3):654-65. doi: 10.1016/j.cell.2013.03.043. PMID: 23622248 Free PMC article.
- PEDLA: predicting enhancers with a deep learning-based algorithmic framework.
Liu F, Li H, Ren C, Bo X, Shu W. Sci Rep. 2016 Jun 22;6:28517. doi: 10.1038/srep28517. PMID: 27329130 Free PMC article.
- Exposure to hypoxia causes stress erythropoiesis and downregulates immune response genes in spleen of mice.
Wang H, Liu D, Song P, Jiang F, Chi X, Zhang T. BMC Genomics. 2021 Jun 5;22(1):413. doi: 10.1186/s12864-021-07731-x. PMID: 34090336 Free PMC article.
- Genome-wide in silico prediction of gene expression.
McLeay RC, Lesluyes T, Cuellar Partida G, Bailey TL. Bioinformatics. 2012 Nov 1;28(21):2789-96. doi: 10.1093/bioinformatics/bts529. Epub 2012 Sep 6. PMID: 22954627 Free PMC article.
- TAMC: A deep-learning approach to predict motif-centric transcriptional factor binding activity based on ATAC-seq profile.
Yang T, Henao R. PLoS Comput Biol. 2022 Sep 12;18(9):e1009921. doi: 10.1371/journal.pcbi.1009921. eCollection 2022 Sep. PMID: 36094959 Free PMC article.