HsMetrics PCT_USABLE_BASES_ON_BAIT definition/calculation error
Hi!
The metric definitions page ( https://broadinstitute.github.io/picard/picard-metric-definitions.html#HsMetrics ) defines the HsMetrics output "PCT_USABLE_BASES_ON_BAIT" as "The number of aligned, de-duped, on-bait bases out of the PF bases available." However, line 91 of https://github.com/broadinstitute/picard/blob/master/src/main/java/picard/analysis/directed/HsMetricCollector.java and lines 531-550 of https://github.com/broadinstitute/picard/blob/master/src/main/java/picard/analysis/directed/TargetMetricsCollector.java show that this metric is calculated from aligned on-bait bases without excluding duplicates. This causes discrepancies between PCT_USABLE_BASES_ON_BAIT and PCT_USABLE_BASES_ON_TARGET, because the latter is calculated from de-duped counts. I just wanted to raise the issue so that the definition can be corrected!
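To make the discrepancy concrete, here is a minimal sketch with made-up numbers. It is not Picard's code: the helper `pct_of_pf_bases`, the variable names, and all counts are hypothetical, and it only assumes that both readings of the metric divide an on-bait base count by the PF bases.

```python
def pct_of_pf_bases(bases: int, pf_bases: int) -> float:
    """Return bases as a fraction of PF bases (0.0 if there are no PF bases)."""
    return bases / pf_bases if pf_bases else 0.0


# Hypothetical counts for illustration only, not taken from a real run.
pf_bases = 1_000_000
on_bait_bases = 600_000        # aligned on-bait bases, duplicates included
on_bait_bases_dedup = 450_000  # the same bases after dropping duplicates

# What the code appears to compute: duplicates are not excluded.
as_computed = pct_of_pf_bases(on_bait_bases, pf_bases)

# What the documented definition describes: de-duped bases only.
as_documented = pct_of_pf_bases(on_bait_bases_dedup, pf_bases)

print(f"as computed:   {as_computed:.2f}")
print(f"as documented: {as_documented:.2f}")
```

With these invented counts the two readings give 0.60 and 0.45, so the gap grows with the duplication rate of the library.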
Best regards