Changes in deepTools2.0 (deepTools 3.5.6 documentation)
Major changes
Note
The major changes encompass features for increased efficiency, new sequencing data types, and additional plots, particularly for QC.
Moreover, deepTools modules can now be used by other python programs. The deepTools API example is part of the new documentation.
Accommodating additional data types
- correlation and comparisons can now be calculated for bigWig files (in addition to BAM files) using multiBigwigSummary and bigwigCompare
- RNA-seq: split-reads are now natively supported
- MNase-seq: using the new option --MNase in bamCoverage, one can now compute read coverage only taking the 2 central base pairs of each mapped fragment into account.
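The idea behind --MNase can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the bamCoverage implementation: the helper name central_two_bp is hypothetical, fragments are assumed to be 0-based half-open intervals, and for odd-length fragments the window is simply placed around the integer midpoint.

```python
def central_two_bp(frag_start, frag_end):
    """Return the 2 bp window (0-based, half-open) at the center of a fragment.

    Hypothetical helper for illustration; not part of deepTools.
    """
    if frag_end - frag_start < 2:
        raise ValueError("fragment must span at least 2 bp")
    # Integer midpoint of the fragment; the window covers one base on each side of it.
    center = frag_start + (frag_end - frag_start) // 2
    return center - 1, center + 1


# A 147 bp nucleosomal fragment starting at position 1000 (made-up coordinates).
window = central_two_bp(1000, 1147)
```

Counting only this window, rather than the whole fragment, concentrates the signal at the presumed nucleosome center.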
Structural updates
- All modules have comprehensive and automatic tests that evaluate proper functioning after any modification of the code.
- Virtualization for stability: we now provide a docker image and enable the easy deployment of deepTools via the Galaxy toolshed.
- Our documentation is now version-aware thanks to readthedocs and sphinx.
- The API is public and documented.
Renamed tools
- heatmapper to plotHeatmap
- profiler to plotProfile
- bamCorrelate to multiBamSummary
- bigwigCorrelate to multiBigwigSummary
- bamFingerprint to plotFingerprint
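For scripts that still call the old names, the renames listed above can be captured in a lookup table. The following is a minimal sketch; the helper update_tool_name is hypothetical and only swaps the executable name, so it does not translate options that changed between versions.

```python
# Old tool names mapped to their 2.0 replacements, as listed above.
RENAMED_TOOLS = {
    "heatmapper": "plotHeatmap",
    "profiler": "plotProfile",
    "bamCorrelate": "multiBamSummary",
    "bigwigCorrelate": "multiBigwigSummary",
    "bamFingerprint": "plotFingerprint",
}


def update_tool_name(command):
    """Replace a renamed tool at the start of a command line; leave others untouched.

    Hypothetical helper for illustration; not part of deepTools.
    """
    parts = command.split(maxsplit=1)
    if not parts:
        return command
    parts[0] = RENAMED_TOOLS.get(parts[0], parts[0])
    return " ".join(parts)


updated = update_tool_name("bamFingerprint --help")
```

Note that a name swap alone is not a full migration: the former correlation tools now only produce the read count table, and the correlation itself is computed in a separate step.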
Increased efficiency
- We dramatically improved the speed of bigwig related tools (multiBigwigSummary and computeMatrix) by using the new pyBigWig module.
- It is now possible to generate one composite heatmap and/or meta-gene image based on multiple bigwig files in one go (see computeMatrix, plotHeatmap, and plotProfile for examples)
- computeMatrix now also accepts multiple input BED files. Each is treated as a group within a sample and is plotted independently.
- We added additional filtering options for handling BAM files, decreasing the need for prior filtering using tools other than deepTools: The --samFlagInclude and --samFlagExclude parameters can, for example, be used to only include (or exclude) forward reads in an analysis.
- We separated the generation of read count tables from the calculation of pairwise correlations that was previously handled by bamCorrelate. Now, read counts are calculated first using multiBamSummary or multiBigwigSummary and the resulting output file can be used for calculating and plotting pairwise correlations using plotCorrelation or for doing a principal component analysis using plotPCA.
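The flag-based filtering mentioned above relies on the bitwise FLAG field of the SAM format. The sketch below shows the usual include/exclude logic, assuming it mirrors the -f/-F convention of samtools; the function name passes_flag_filter is hypothetical, and "forward reads" is interpreted here as reads on the forward strand (bit 16 not set).

```python
READ_REVERSE_STRAND = 0x10  # SAM flag bit 16: read maps to the reverse strand
FIRST_IN_PAIR = 0x40        # SAM flag bit 64: first read of a pair


def passes_flag_filter(flag, include=0, exclude=0):
    """Keep a read only if all `include` bits are set and no `exclude` bit is set.

    Hypothetical helper for illustration; not part of deepTools.
    """
    return (flag & include) == include and (flag & exclude) == 0


# Example FLAG values: 99 is a forward-strand first mate, 147 a reverse-strand second mate.
flags = [0, 16, 99, 147]
forward_only = [f for f in flags if passes_flag_filter(f, exclude=READ_REVERSE_STRAND)]
first_mates = [f for f in flags if passes_flag_filter(f, include=FIRST_IN_PAIR)]
```

Under this reading, passing 16 as the exclude value would drop reverse-strand reads, while passing it as the include value would keep only those.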
New features and tools
- Correlation analyses are no longer limited to BAM files – bigwig files are possible, too! (see multiBigwigSummary)
- Correlation coefficients can now be computed even if the data contains NaNs.
- Added new quality control tools:
  - use plotCoverage to plot the coverage over base pairs
  - use plotPCA for principal component analysis
  - bamPEFragmentSize can be used to calculate the average fragment size for paired-end read data
- Added the possibility for hierarchical clustering, besides k-means, to plotProfile and plotHeatmap
- plotProfile has many more options to make compelling summary plots
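One common way to compute a correlation coefficient in the presence of NaNs is to drop every position where either sample is missing and correlate the rest. The sketch below does this for the Pearson coefficient; it is an assumption about the general approach, not a description of how deepTools handles NaNs internally, and the function name pearson_ignore_nan is hypothetical.

```python
import math


def pearson_ignore_nan(xs, ys):
    """Pearson correlation over positions where both values are present.

    Hypothetical helper for illustration; not part of deepTools.
    """
    pairs = [(x, y) for x, y in zip(xs, ys)
             if not (math.isnan(x) or math.isnan(y))]
    n = len(pairs)
    if n < 2:
        return float("nan")  # not enough complete pairs
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return float("nan")  # a constant sample has no defined correlation
    return cov / math.sqrt(var_x * var_y)


nan = float("nan")
# Only the first two bins are covered in both samples (made-up values).
r = pearson_ignore_nan([1.0, 2.0, nan, 4.0], [2.0, 4.0, 6.0, nan])
```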
Minor changes
Changed parameter names and settings
- computeMatrix can now read files with DOS newline characters.
- --missingDataAsZero was renamed to --skipNonCoveredRegions for clarity in bamCoverage and bamCompare.
- Read extension was made optional and we removed the need to specify a default fragment length for most of the tools: --fragmentLength was thus replaced by the new optional parameter --extendReads.
- Added option --skipChromosomes to multiBigwigSummary, which can be used to, for example, skip all ‘random’ chromosomes.
- Added the option for adding titles to QC plots.
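To see why DOS newlines matter, consider a BED line that ends in a carriage return plus line feed: if only the line feed is removed, the last column keeps a stray carriage return. The sketch below shows one way to tolerate both endings; the helper parse_bed_line is hypothetical and is not the parser used by computeMatrix.

```python
def parse_bed_line(line):
    """Split one tab-separated BED line, tolerating both Unix and DOS line endings.

    Hypothetical helper for illustration; not part of deepTools.
    """
    # Strip any trailing carriage return and line feed before splitting.
    fields = line.rstrip("\r\n").split("\t")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    return (chrom, start, end, *fields[3:])


# The same record (made-up values) with a DOS and a Unix line ending.
dos_record = parse_bed_line("chr1\t100\t200\tgeneA\t0\t+\r\n")
unix_record = parse_bed_line("chr1\t100\t200\tgeneA\t0\t+\n")
```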
Bug fixes
- Resolved an error introduced by numpy version 1.10 in computeMatrix.
- Improved plotting features for plotProfile when using the plot types ‘overlapped_lines’ and ‘heatmap’.
- Fixed problem with BED intervals in multiBigwigSummary and multiBamSummary that returned wrongly labeled raw counts.
- multiBigwigSummary now also considers chromosomes as identical when the names between samples differ by ‘chr’ prefix, e.g. chr1 vs. 1.
- Fixed problem with wrongly labeled proper read pairs in a BAM file. We now have additional checks to determine if a read pair is a proper pair: the reads must face each other and are not allowed to be farther apart than 4x the mean fragment length.
- For bamCoverage and bamCompare, the behavior of scaleFactor was updated such that now, if given in combination with the normalization options (--normalizeTo1x or --normalizeUsingRPKM), the given scaling factor will be multiplied with the factor computed by the respective normalization method.
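Two of the fixes above lend themselves to a short sketch: the additional proper-pair checks and the combination of a user-supplied scaling factor with a normalization factor. Both functions are hypothetical illustrations under simplifying assumptions, not the deepTools code. In particular, "facing each other" is approximated by requiring the forward-strand read to start at or before the reverse-strand read, and the distance is taken to be the fragment length.

```python
def is_proper_pair(read_start, read_is_reverse, mate_start, mate_is_reverse,
                   fragment_length, mean_fragment_length):
    """Approximate the two checks: reads face each other, fragment <= 4x the mean.

    Hypothetical helper for illustration; not part of deepTools.
    """
    if read_is_reverse == mate_is_reverse:
        return False  # same strand: the reads cannot face each other
    if read_is_reverse:
        forward_start, reverse_start = mate_start, read_start
    else:
        forward_start, reverse_start = read_start, mate_start
    if forward_start > reverse_start:
        return False  # the reads point away from each other
    return fragment_length <= 4 * mean_fragment_length


def effective_scale(scale_factor, normalization_factor):
    """Multiply the user-supplied factor with the one computed by the normalization."""
    return scale_factor * normalization_factor


# Made-up numbers: a 200 bp fragment in a library with a mean fragment length of 180 bp.
ok = is_proper_pair(100, False, 250, True, 200, 180)
combined = effective_scale(0.5, 2.0)
```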