Using MCL to Extract Clusters from Networks

Abstract

MCL is a general-purpose clustering algorithm for both weighted and unweighted networks. The algorithm uses network topology as well as edge weights, is highly scalable, and has been applied in a wide variety of bioinformatic methods. In this chapter, we give protocols and case studies for clustering two kinds of networks: those derived from protein sequence similarities and those derived from gene expression profile correlations.

Author information

Authors and Affiliations

  1. European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
    Stijn van Dongen & Cei Abreu-Goodger

Corresponding author

Correspondence to Stijn van Dongen.

Editor information

Editors and Affiliations

  1. Labo. Bioinformatique des Génomes et des Réseaux (BiGRe), Université Libre de Bruxelles, Bd. du Triomphe, Bruxelles, 1050, Belgium
    Jacques van Helden
  2. Labo. Bioinformatique des Génomes et des Réseaux (BiGRe), Université Libre de Bruxelles, Bd. du Triomphe, Bruxelles, 1050, Belgium
    Ariane Toussaint
  3. INSERM 1024, Institut de Biologie de l'Ecole Normale, rue d'Ulm 46, Paris, 75230, France
    Denis Thieffry

Rights and permissions

© 2012 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

van Dongen, S., Abreu-Goodger, C. (2012). Using MCL to Extract Clusters from Networks. In: van Helden, J., Toussaint, A., Thieffry, D. (eds) Bacterial Molecular Networks. Methods in Molecular Biology, vol 804. Springer, New York, NY. https://doi.org/10.1007/978-1-61779-361-5_15
