Bioconda: sustainable and comprehensive software distribution for the life sciences
To the Editor: Bioinformatics software comes in a variety of programming languages and requires diverse installation methods. This heterogeneity makes management of a software stack complicated, error-prone, and inordinately time-consuming. Whereas software deployment has traditionally been handled by administrators, ensuring the reproducibility of data analyses [1,2,3] requires that the researcher be able to maintain full control of the software environment, rapidly modify it without administrative privileges, and reproduce the same software stack on different machines.
The Conda package manager (https://conda.io) has become an increasingly popular means to overcome these challenges for all major operating systems. Conda normalizes software installations across language ecosystems by describing each software package with a human-readable ‘recipe’ that defines meta-information and dependencies, as well as a simple ‘build script’ that performs the steps necessary to build and install the software. Conda builds software packages in an isolated environment, transforming them into relocatable binaries. Importantly, it obviates reliance on system-wide administration privileges by allowing users to generate isolated software environments in which they can manage software versions by project, without generating incompatibilities and side-effects (Supplementary Results). These environments support reproducibility, as they can be rapidly exchanged via files that describe their installation state. Conda is tightly integrated into popular solutions for reproducible data analysis such as Galaxy [4], bcbio-nextgen (https://github.com/chapmanb/bcbio-nextgen), and Snakemake [5]. To further enhance reproducibility guarantees, Conda can be combined with container or virtual machine-based approaches and archive facilities such as Zenodo (Supplementary Results). Finally, although Conda provides many commonly used packages by default, it also allows users to optionally include additional, community-managed repositories of packages (termed channels).
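As an illustration of how an environment can be exchanged via a file that describes its installation state, the following minimal sketch renders a Conda environment file as plain text. The environment name, the channel list, and the pinned package versions are assumptions chosen for illustration and are not taken from the article; the helper `render_environment` is hypothetical and not part of Conda.

```python
def render_environment(name, channels, dependencies):
    """Render a minimal Conda environment file (YAML) as text.

    Hypothetical helper for illustration only; it is not part of Conda.
    """
    lines = [f"name: {name}", "channels:"]
    # Channels are listed one per line, in the order given by the caller.
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    # Each dependency is a package name, optionally pinned as name=version.
    lines += [f"  - {dep}" for dep in dependencies]
    return "\n".join(lines) + "\n"


# Assumed example project: the name, channels, and versions are illustrative.
env_text = render_environment(
    name="example-project",
    channels=["conda-forge", "bioconda"],
    dependencies=["samtools=1.9", "bwa=0.7.17"],
)
print(env_text)
```

Saved as `environment.yml`, a file of this shape can be passed to `conda env create -f environment.yml` to recreate the environment on another machine without administrative privileges.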
References
1. Mesirov, J. P. Science 327, 415–416 (2010).
2. Baker, M. Nature 533, 452–454 (2016).
3. Munafò, M. R. et al. Nat. Hum. Behav. 1, 0021 (2017).
4. Afgan, E. et al. Nucleic Acids Res. 44, W3–W10 (2016).
5. Köster, J. & Rahmann, S. Bioinformatics 28, 2520–2522 (2012).
6. Field, D. et al. Nat. Biotechnol. 24, 801–803 (2006).
Acknowledgements
We thank all contributors, the conda-forge team, and Anaconda Inc. for excellent cooperation. Further, we thank Travis CI (https://travis-ci.com) and Circle CI (https://circleci.com) for providing free Linux and macOS computing capacity. Finally, we thank ELIXIR (https://www.elixir-europe.org) for constant support and donation of staff. This work was supported by the Intramural Program of the National Institute of Diabetes and Digestive and Kidney Diseases, US National Institutes of Health (R.D.), the Netherlands Organisation for Scientific Research (NWO) (VENI grant 016.Veni.173.076 to J.K.), the German Research Foundation (SFB 876 to J.K.), and the NYU Abu Dhabi Research Institute for the NYU Abu Dhabi Center for Genomics and Systems Biology, program number CGSB1 (grant to J.R. and A. Yousif).
Author information
Author notes
- These authors contributed equally: Björn Grüning and Ryan Dale.
- A full list of authors and affiliations is available as Supplementary Table 1.
Authors and Affiliations
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
Björn Grüning
- Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, US National Institutes of Health, Bethesda, MD, USA
Ryan Dale
- Division of CBRN Security and Defence, FOI–Swedish Defence Research Agency, Umeå, Sweden
Andreas Sjödin
- Department of Chemistry, Computational Life Science Cluster (CLiC), Umeå University, Umeå, Sweden
Andreas Sjödin
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
Brad A. Chapman
- Center for Genomics and Systems Biology, Genomics Core, NYU Abu Dhabi, Abu Dhabi, United Arab Emirates
Jillian Rowe
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
Christopher H. Tomkins-Tinch
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
Christopher H. Tomkins-Tinch
- Laboratory of Bioinformatics and Computational Biology, A. C. Camargo Cancer Center, São Paulo, Brazil
Renan Valieris
- Algorithms for Reproducible Bioinformatics, Genome Informatics, Institute of Human Genetics, University Hospital Essen, University of Duisburg–Essen, Essen, Germany
Johannes Köster
- Medical Oncology, Dana Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
Johannes Köster
Authors
- Björn Grüning
- Ryan Dale
- Andreas Sjödin
- Brad A. Chapman
- Jillian Rowe
- Christopher H. Tomkins-Tinch
- Renan Valieris
- Johannes Köster
Consortia
The Bioconda Team
Contributions
J.K. and R.D. wrote the manuscript and conducted the data analysis. K. Beauchamp, C. Brueffer, B.A.C., F. Eggenhofer, B.G., E. Pruesse, M. Raden, J.R., D. Ryan, I. Shlyakter, A.S., C.H.T.-T., and R.V. (in alphabetical order) contributed to writing of the manuscript. D.A. Søndergaard supervised student programmers on writing Conda package recipes and maintaining the connection with ELIXIR. All other members of the Bioconda Team contributed or maintained recipes (author order was determined by the number of commits in October 2017).
Corresponding author
Correspondence to Johannes Köster.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Grüning, B., Dale, R., Sjödin, A. et al. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods 15, 475–476 (2018). https://doi.org/10.1038/s41592-018-0046-7
- Published: 02 July 2018
- Issue Date: July 2018
- DOI: https://doi.org/10.1038/s41592-018-0046-7