Using MCL to Extract Clusters from Networks

Abstract

MCL is a general purpose cluster algorithm for both weighted and unweighted networks. The algorithm utilises network topology as well as edge weights, is highly scalable and has been applied in a wide variety of bioinformatic methods. In this chapter, we give protocols and case studies for clustering of networks derived from, respectively, protein sequence similarities and gene expression profile correlations.

Author information

Authors and Affiliations

European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
Stijn van Dongen & Cei Abreu-Goodger

Corresponding author

Correspondence to Stijn van Dongen.

Editor information

Editors and Affiliations

Labo. Bioinformatique des Génomes et des Réseaux (BiGRe), Université Libre de Bruxelles, Bd. du Triomphe, Bruxelles, 1050, Belgium
Jacques van Helden
Labo. Bioinformatique des Génomes et des Réseaux (BiGRe), Université Libre de Bruxelles, Bd. du Triomphe, Bruxelles, 1050, Belgium
Ariane Toussaint
INSERM 1024, Institut de Biologie de l'Ecole Normale, rue d'Ulm 46, Paris, 75230, France
Denis Thieffry

About this protocol

Cite this protocol

van Dongen, S., Abreu-Goodger, C. (2012). Using MCL to Extract Clusters from Networks. In: van Helden, J., Toussaint, A., Thieffry, D. (eds) Bacterial Molecular Networks. Methods in Molecular Biology, vol 804. Springer, New York, NY. https://doi.org/10.1007/978-1-61779-361-5_15

DOI: https://doi.org/10.1007/978-1-61779-361-5_15
Published: 28 October 2011
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-61779-360-8
Online ISBN: 978-1-61779-361-5
eBook Packages: Springer Protocols