GitHub - lczech/gappa: A toolkit for analyzing and visualizing phylogenetic (placement) data
Features
gappa is a collection of commands for working with phylogenetic data. Its main focus is evolutionary placements of short environmental sequences on a reference phylogenetic tree. Such data are typically produced by tools such as EPA-ng, RAxML-EPA, or pplacer, and are usually stored in jplace files.
See the Wiki pages for the full list of all subcommands and their documentation.
We recommend our article "Metagenomic Analysis Using Phylogenetic Placement — A Review of the First Decade" as an introduction to the topic of phylogenetic placement.
Setup
There are two ways to get gappa:
- Install it via conda.
- Build it from source.
While conda will only give you proper releases, building from source will give you the latest (development) version. For that, simply get it, and build it:
git clone --recursive https://github.com/lczech/gappa.git
cd gappa
make
You can also use the green "Code" button above or click here to download the source as a zip archive. Unpack it, and call make in the main directory to build everything.
Requirements:
- Make and CMake 3.1 or higher.
- A fairly up-to-date C++11 compiler, e.g., clang++ 3.6 or GCC 4.9, or higher.
After building, the executable is stored in the bin directory, and is used as follows.
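Before building, it can help to check which of the required tools are installed. The following is a minimal sketch, not part of gappa itself: the helper names version_ge and report_tool are hypothetical, and the comparison assumes a sort that supports the -V (version sort) option, as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
# Assumes `sort -V` (version sort) is available, e.g. from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: print whether a tool is on the PATH,
# together with the first line of its version output if it is.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s: %s\n' "$1" "$("$1" --version 2>/dev/null | head -n 1)"
    else
        printf '%s: not found\n' "$1"
    fi
}

# Only one of the two compilers needs to be present.
for tool in make cmake g++ clang++; do
    report_tool "$tool"
done

# Compare the installed CMake version against the 3.1 minimum.
# `cmake --version` prints "cmake version X.Y.Z" on its first line.
if command -v cmake >/dev/null 2>&1; then
    cmake_version=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$cmake_version" 3.1; then
        echo "CMake $cmake_version satisfies the 3.1 minimum"
    else
        echo "CMake $cmake_version is older than 3.1"
    fi
fi
```

The same version_ge helper could be applied to the compiler versions, but the format of their version output varies between vendors, so it is left out of this sketch.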
Usage and Documentation
gappa is used via its command line interface, with subcommands for each task. The commands have the general structure:
gappa <module> <subcommand> <options>
See the Wiki pages for the full list of all subcommands and their documentation.
For bug reports and feature requests of gappa, please open an issue on GitHub.
For user support, please see our Phylogenetic Placement Google Group. It is intended for discussions about phylogenetic placement, and for user support for our software tools, such as EPA-ng, gappa, and genesis.
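As a concrete instance of the general structure, the sketch below runs gappa tools citation, which is the one subcommand named in this README (module tools, subcommand citation). The helper find_gappa is hypothetical and only illustrates one way to pick up either a local build in the bin directory or an installation on the PATH, such as one from conda.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the gappa executable.
# Prefers a local build in <dir>/bin (default: current directory),
# and otherwise falls back to whatever `gappa` is found on the PATH.
find_gappa() {
    if [ -x "${1:-.}/bin/gappa" ]; then
        printf '%s\n' "${1:-.}/bin/gappa"
    elif command -v gappa >/dev/null 2>&1; then
        command -v gappa
    else
        return 1
    fi
}

if gappa_bin=$(find_gappa); then
    # General structure: gappa <module> <subcommand> <options>
    "$gappa_bin" tools citation
else
    echo "gappa not found; build it from source or install it via conda" >&2
fi
```

Other modules and subcommands follow the same pattern; their names and options are listed on the Wiki pages.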
Citation
To generally cite gappa, please use
Genesis and Gappa: processing, analyzing and visualizing phylogenetic (placement) data.
Lucas Czech, Pierre Barbera, and Alexandros Stamatakis.
Bioinformatics, 2020. https://doi.org/10.1093/bioinformatics/btaa070
Each command also prints out the relevant references for that command. Then, the command gappa tools citation can be used to obtain details on those references. See also our Wiki page Citation and References for a list of all references.
Lastly, we recommend reading our comprehensive review of the topic
Metagenomic Analysis Using Phylogenetic Placement—A Review of the First Decade.
Lucas Czech, Alexandros Stamatakis, Micah Dunthorn, and Pierre Barbera.
Frontiers in Bioinformatics, 2022. https://doi.org/10.3389/fbinf.2022.871393
to get an overview of phylogenetic placement and its methods.
Behind the scenes
gappa is short for Genesis Applications for Phylogenetic Placement Analysis. This is because most of the work of gappa is actually performed by our genesis library. See there if you are interested in the implementation details.