FIMO: scanning for occurrences of a given motif
Charles E Grant et al. Bioinformatics. 2011.
Abstract
Summary: A motif is a short DNA or protein sequence that contributes to the biological function of the sequence in which it resides. Over the past several decades, many computational methods have been described for identifying, characterizing and searching with sequence motifs. Critical to nearly any motif-based sequence analysis pipeline is the ability to scan a sequence database for occurrences of a given motif described by a position-specific frequency matrix.
Results: We describe Find Individual Motif Occurrences (FIMO), a software tool for scanning DNA or protein sequences with motifs described as position-specific scoring matrices. The program computes a log-likelihood ratio score for each position in a given sequence database, uses established dynamic programming methods to convert this score to a P-value and then applies false discovery rate analysis to estimate a q-value for each position in the given sequence. FIMO provides output in a variety of formats, including HTML, XML and several Santa Cruz Genome Browser formats. The program is efficient, allowing for the scanning of DNA sequences at a rate of 3.5 Mb/s on a single CPU.
Availability and implementation: FIMO is part of the MEME Suite software toolkit. A web server and source code are available at http://meme.sdsc.edu.
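The three steps named in the Results paragraph (a log-likelihood ratio score per position, a dynamic programming conversion to a P-value, and a false discovery rate step that yields q-values) can be illustrated with a minimal Python sketch. This is not FIMO's source code. The function names, the toy two-column motif, the uniform background, the integer scaling factor of 100, and the use of the Benjamini-Hochberg procedure for the q-value step are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def log_odds_pssm(freqs, background, scale=100):
    """Turn per-column frequencies into integer-scaled log2 odds scores.

    Assumes every frequency is non-zero (e.g. pseudocounts already applied).
    """
    return [
        {b: round(scale * math.log2(col[b] / background[b])) for b in background}
        for col in freqs
    ]


def score_pvalues(pssm, background):
    """Dynamic programming over motif columns.

    Builds the exact distribution of the total score under the background
    model, then returns P(score >= s) for every reachable score s.
    """
    dist = {0: 1.0}
    for col in pssm:
        nxt = defaultdict(float)
        for s, p in dist.items():
            for b, q in background.items():
                nxt[s + col[b]] += p * q
        dist = dict(nxt)
    pvals, tail = {}, 0.0
    for s in sorted(dist, reverse=True):
        tail += dist[s]
        pvals[s] = min(tail, 1.0)
    return pvals


def scan(seq, pssm, pvals):
    """Score every window of the sequence; return (position, score, p-value)."""
    w = len(pssm)
    hits = []
    for i in range(len(seq) - w + 1):
        s = sum(pssm[j][seq[i + j]] for j in range(w))
        hits.append((i, s, pvals[s]))
    return hits


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (an assumed stand-in for q-values)."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    q = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        q[i] = running_min
    return q


# Hypothetical toy motif: column 1 prefers A, column 2 prefers G.
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
freqs = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
pssm = log_odds_pssm(freqs, background)
pvals = score_pvalues(pssm, background)
hits = scan("TAGC", pssm, pvals)
qvals = bh_qvalues([p for _, _, p in hits])
for (pos, score, p), q in zip(hits, qvals):
    print(pos, score, p, q)
```

Scaling the scores to integers keeps the number of distinct reachable scores small, which is what makes the column-by-column dynamic programming table tractable for longer motifs.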
Figures
Fig. 1. Using FIMO to identify candidate CTCF binding sites in the human genome. (A) Sample FIMO HTML output, showing the locations of the top-scoring occurrences of the CTCF motif in the human genome. (B) A precision-recall curve created by comparing FIMO's ranked list of CTCF sites with a gold standard derived from a ChIP-seq experiment.
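A precision-recall curve of the kind described in panel (B) can be computed from any ranked list of predictions and a set of gold-standard positives. The sketch below is illustrative only. The function name and the site identifiers are hypothetical, and it assumes predicted sites have already been matched to gold-standard sites, a step this page does not describe.

```python
def precision_recall(ranked_sites, gold):
    """Return (precision, recall) after each prefix of a ranked prediction list.

    ranked_sites: predictions ordered from most to least confident.
    gold: iterable of identifiers treated as true sites (assumed non-empty).
    """
    gold = set(gold)
    if not gold:
        raise ValueError("gold standard must contain at least one site")
    curve = []
    true_positives = 0
    for k, site in enumerate(ranked_sites, start=1):
        if site in gold:
            true_positives += 1
        curve.append((true_positives / k, true_positives / len(gold)))
    return curve


# Hypothetical identifiers standing in for genomic sites.
ranked = ["site_a", "site_b", "site_c", "site_d"]
gold_standard = {"site_a", "site_c", "site_e"}
curve = precision_recall(ranked, gold_standard)
for precision, recall in curve:
    print(round(precision, 3), round(recall, 3))
```

Because "site_e" is never predicted in this toy input, recall tops out at two thirds no matter how far down the ranked list the curve extends.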
Similar articles
- MEME SUITE: tools for motif discovery and searching.
Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, Ren J, Li WW, Noble WS. Nucleic Acids Res. 2009 Jul;37(Web Server issue):W202-8. doi: 10.1093/nar/gkp335. Epub 2009 May 20. PMID: 19458158 Free PMC article.
- MEME: discovering and analyzing DNA and protein sequence motifs.
Bailey TL, Williams N, Misleh C, Li WW. Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W369-73. doi: 10.1093/nar/gkl198. PMID: 16845028 Free PMC article.
- rMotifGen: random motif generator for DNA and protein sequences.
Rouchka EC, Hardin CT. BMC Bioinformatics. 2007 Aug 7;8:292. doi: 10.1186/1471-2105-8-292. PMID: 17683637 Free PMC article.
- Epigenetic priors for identifying active transcription factor binding sites.
Cuellar-Partida G, Buske FA, McLeay RC, Whitington T, Noble WS, Bailey TL. Bioinformatics. 2012 Jan 1;28(1):56-62. doi: 10.1093/bioinformatics/btr614. Epub 2011 Nov 8. PMID: 22072382 Free PMC article.
- Discovering sequence motifs.
Bailey TL. Methods Mol Biol. 2008;452:231-51. doi: 10.1007/978-1-60327-159-2_12. PMID: 18566768 Review.
Cited by
- Widespread reorganisation of pluripotent factor binding and gene regulatory interactions between human pluripotent states.
Chovanec P, Collier AJ, Krueger C, Várnai C, Semprich CI, Schoenfelder S, Corcoran AE, Rugg-Gunn PJ. Nat Commun. 2021 Apr 7;12(1):2098. doi: 10.1038/s41467-021-22201-4. PMID: 33828098 Free PMC article.
- Single Cell RNA-Seq and Machine Learning Reveal Novel Subpopulations in Low-Grade Inflammatory Monocytes With Unique Regulatory Circuits.
Lee J, Geng S, Li S, Li L. Front Immunol. 2021 Feb 23;12:627036. doi: 10.3389/fimmu.2021.627036. eCollection 2021. PMID: 33708217 Free PMC article.
- Systems-biology analysis of rheumatoid arthritis fibroblast-like synoviocytes implicates cell line-specific transcription factor function.
Ainsworth RI, Hammaker D, Nygaard G, Ansalone C, Machado C, Zhang K, Zheng L, Carrillo L, Wildberg A, Kuhs A, Svensson MND, Boyle DL, Firestein GS, Wang W. Nat Commun. 2022 Oct 20;13(1):6221. doi: 10.1038/s41467-022-33785-w. PMID: 36266270 Free PMC article.
- Functional characterization of RebL1 highlights the evolutionary conservation of oncogenic activities of the RBBP4/7 orthologue in Tetrahymena thermophila.
Nabeel-Shah S, Garg J, Saettone A, Ashraf K, Lee H, Wahab S, Ahmed N, Fine J, Derynck J, Pu S, Ponce M, Marcon E, Zhang Z, Greenblatt JF, Pearlman RE, Lambert JP, Fillingham J. Nucleic Acids Res. 2021 Jun 21;49(11):6196-6212. doi: 10.1093/nar/gkab413. PMID: 34086947 Free PMC article.
- Diverse noncoding mutations contribute to deregulation of cis-regulatory landscape in pediatric cancers.
He B, Gao P, Ding YY, Chen CH, Chen G, Chen C, Kim H, Tasian SK, Hunger SP, Tan K. Sci Adv. 2020 Jul 24;6(30):eaba3064. doi: 10.1126/sciadv.aba3064. eCollection 2020 Jul. PMID: 32832663 Free PMC article.