add function dotproduct and unit tests by tnaake · Pull Request #17 · rformassspectrometry/MsCoreUtils

Dear @sgibb,
thanks for pointing this out.

For two spectra with "lower" m/z values, we get a lower similarity score. If we set a higher n, we get a higher similarity score.

This means 2 of 3 peaks are shared. I have used the formula implemented in this dotproduct function (with weights for m/z and intensities) before, and it has also been used in some publications. What we could do is make n=0 and m=1 the defaults, i.e. the m/z values are not used and only the intensity values enter the score, as they are.

x1 <- data.frame(mz = c(101, NA, 201), intensity = c(1, 0, 1))
y1 <- data.frame(mz = c(101, 102, 201), intensity = c(1, 1, 1))
x2 <- data.frame(mz = c(101, NA, 201), intensity = c(3, 0, 5))
y2 <- data.frame(mz = c(101, 102, 201), intensity = c(3, 4, 5))
dotproduct(x1, y1, m=1, n=0) ## 0.6666667
dotproduct(x2, y2, m=1, n=0) ## 0.68
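
The two values above can be checked by hand with a minimal sketch. It is not the code from this PR: `dotproduct_sketch` is a hypothetical helper that assumes the weights are `mz^n * intensity^m` and that the score is the squared dot product of the weights divided by the product of their sums of squares. Dropping missing weights with `na.rm = TRUE` is also an assumption. Note that in R `NA^0` evaluates to 1, so with n=0 the missing m/z value does not propagate.

```r
# Hypothetical re-implementation for illustration only; not the MsCoreUtils code.
# Assumed weights: W = mz^n * intensity^m
# Assumed score:   sum(Wx * Wy)^2 / (sum(Wx^2) * sum(Wy^2))
dotproduct_sketch <- function(x, y, m = 1, n = 0) {
    wx <- x$mz^n * x$intensity^m
    wy <- y$mz^n * y$intensity^m
    sum(wx * wy, na.rm = TRUE)^2 /
        (sum(wx^2, na.rm = TRUE) * sum(wy^2, na.rm = TRUE))
}

x1 <- data.frame(mz = c(101, NA, 201), intensity = c(1, 0, 1))
y1 <- data.frame(mz = c(101, 102, 201), intensity = c(1, 1, 1))
x2 <- data.frame(mz = c(101, NA, 201), intensity = c(3, 0, 5))
y2 <- data.frame(mz = c(101, 102, 201), intensity = c(3, 4, 5))

dotproduct_sketch(x1, y1, m = 1, n = 0) ## 2^2 / (2 * 3)    = 0.6666667
dotproduct_sketch(x2, y2, m = 1, n = 0) ## 34^2 / (34 * 50) = 0.68
```

With n=0 and m=1, binary intensities give exactly the fraction of shared peaks (2 of 3), while real intensities shift the score slightly because the unmatched peak contributes its squared intensity to the denominator only.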