lefse – The Huttenhower Lab

LEfSe (Linear discriminant analysis Effect Size) determines the features (organisms, clades, operational taxonomic units, genes, or functions) most likely to explain differences between classes by coupling standard tests for statistical significance with additional tests encoding biological consistency and effect relevance.

LEfSe is available as a Galaxy module, a Conda formula, and a Docker image, and it is also included in bioBakery (VM and cloud). For additional information, please refer to the LEfSe paper.

Github Repository || Galaxy Module || Tutorial || Forum

Citation:

If you find LEfSe useful in your research, please cite our paper (Segata et al., 2011):

Genome Biology, 2011 Jun 24;12(6):R60

Installation (Conda/Docker/VM)

LEfSe can be installed with Conda or run from a Docker image. If you are using bioBakery (Vagrant VM or cloud), you do not need to install LEfSe, because the tool and its dependencies are already installed.

Install with Conda: $ conda install -c biobakery lefse

Install with Docker: $ docker run -it biobakery/lefse bash

The input is a tab-delimited text file consisting of a list of numerical features, the class vector, and optionally the subclass and subject vectors. Features can be raw read counts or, more generally, floating-point abundance values; the first field of each feature is its name. The class, subclass, and subject vectors each have a name (the first field) followed by a list of non-numerical strings.

Features may be organized in either columns or rows, but given the high-dimensional nature of metagenomic data, listing the features in rows is preferred.
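
To make the layout concrete, here is a minimal Python sketch that writes a toy input file with features in rows and then checks it. The row names, sample values, feature names, and helper functions (`write_lefse_input`, `check_layout`) are all hypothetical and chosen only for illustration; only the overall layout (name in the first field, metadata rows holding strings, feature rows holding numbers) follows the description above.

```python
import csv
import os
import tempfile

# Hypothetical metadata rows: the first field is the row name, the remaining
# fields are non-numerical strings, one per sample.
metadata_rows = [
    ["oxygen_availability", "High_O2", "High_O2", "Low_O2", "Low_O2"],  # class
    ["body_site", "Skin", "Nasal", "Gut", "Oral"],            # subclass (optional)
    ["subject_id", "subj_1", "subj_2", "subj_3", "subj_4"],   # subject (optional)
]

# Hypothetical feature rows: the first field is the feature name, the rest are
# abundances (read counts would work the same way).
feature_rows = [
    ["Bacteria|Firmicutes", 0.12, 0.20, 0.55, 0.61],
    ["Bacteria|Actinobacteria", 0.70, 0.64, 0.05, 0.08],
]


def write_lefse_input(path, metadata, features):
    """Write metadata rows followed by feature rows as tab-delimited text."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(metadata)
        writer.writerows(features)


def is_number(text):
    """Return True if the string parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def check_layout(path, n_metadata_rows):
    """Check equal row widths, string-only metadata, and numeric features."""
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    same_width = len({len(row) for row in rows}) == 1
    metadata_ok = all(
        not is_number(value) for row in rows[:n_metadata_rows] for value in row[1:]
    )
    features_ok = all(
        is_number(value) for row in rows[n_metadata_rows:] for value in row[1:]
    )
    return same_width and metadata_ok and features_ok


if __name__ == "__main__":
    # Write the toy file to a temporary directory and validate it.
    toy_path = os.path.join(tempfile.mkdtemp(), "toy_lefse_input.txt")
    write_lefse_input(toy_path, metadata_rows, feature_rows)
    print(toy_path, "layout ok:", check_layout(toy_path, len(metadata_rows)))
```

The check is deliberately strict about metadata being non-numerical, mirroring the description of the class, subclass, and subject vectors; it is a sanity check on a toy file, not a replacement for LEfSe's own input handling.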

For the purpose of this tutorial we will be using a sample input file (hmp_aerobiosis_small.txt).

To visualize the results, LEfSe provides a couple of options. All of them require the output of run_lefse.py (in this case, hmp_aerobiosis_small.res).

$ plot_res.py hmp_aerobiosis_small.res hmp_aerobiosis_small.png  


$ plot_cladogram.py hmp_aerobiosis_small.res hmp_aerobiosis_small.cladogram.png --format png  
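
Before plotting, it can help to check which features in the .res file were flagged as discriminative. The Python sketch below assumes each line holds five tab-separated fields (feature name, log of the highest class mean, class with the highest mean, LDA score, p-value), with the class and LDA fields left empty for non-discriminative features. That layout is an assumption not stated on this page, so verify it against your own output. The function name `parse_res_lines` and the example rows are hypothetical.

```python
def parse_res_lines(lines):
    """Return (feature, class_name, lda_score) tuples for discriminative
    features, sorted by descending LDA score.

    Assumes the five-column tab-separated layout described above.
    """
    results = []
    for line in lines:
        # Strip only line endings so that empty trailing fields survive.
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        feature, _log_max_mean, class_name, lda = fields[:4]
        if not class_name or not lda:
            continue  # feature was not flagged as discriminative
        results.append((feature, class_name, float(lda)))
    results.sort(key=lambda item: item[2], reverse=True)
    return results


# Hypothetical example rows, not taken from a real LEfSe run.
example_lines = [
    "Bacteria.Firmicutes\t5.30\tLow_O2\t4.10\t0.002\n",
    "Bacteria.Proteobacteria\t4.80\t\t\t-\n",
    "Bacteria.Actinobacteria\t5.60\tHigh_O2\t4.75\t0.0004\n",
]

if __name__ == "__main__":
    for feature, class_name, lda in parse_res_lines(example_lines):
        print(f"{feature}\t{class_name}\t{lda:.2f}")
```

To apply this to the tutorial output, pass an open file handle instead of the example list, for instance `with open("hmp_aerobiosis_small.res") as handle: parse_res_lines(handle)`.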
