Vectorize adjacent_find by AlexGuteniev · Pull Request #5331 · microsoft/STL

The approach is similar to the `find` vectorization, except for the shifted load, which is similar to `unique` in #5092. Nothing novel 🥱

This time I haven't tried to avoid the double load. I trust past experience that it is faster than blending with the previous value, or using less than the whole vector as a step.
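For readers unfamiliar with the shifted-load idea, here is a minimal sketch of it for 8-bit elements. It is not the code from this PR: the function name `adjacent_find_u8_sketch`, the SSE2-only path, and the index-based return value are all assumptions made for illustration. Each iteration loads the same region twice, once at `data + i` and once shifted by one element at `data + i + 1`, and compares the two vectors lane by lane.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h> // SSE2 intrinsics
#define SKETCH_HAS_SSE2 1
#endif

// Hypothetical helper, not the STL's internal function.
// Returns the index of the first i with data[i] == data[i + 1], or n if none.
inline std::size_t adjacent_find_u8_sketch(const std::uint8_t* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    std::size_t i = 0;
#ifdef SKETCH_HAS_SSE2
    // One step compares elements [i, i + 16) against [i + 1, i + 17),
    // so it needs 17 readable bytes starting at data + i.
    while (i + 17 <= n) {
        const __m128i cur  = _mm_loadu_si128(reinterpret_cast<const __m128i*>(data + i));
        // The "double load": re-read the same bytes, shifted by one element.
        const __m128i next = _mm_loadu_si128(reinterpret_cast<const __m128i*>(data + i + 1));
        const unsigned mask =
            static_cast<unsigned>(_mm_movemask_epi8(_mm_cmpeq_epi8(cur, next)));
        if (mask != 0) {
            // The lowest set bit marks the first equal adjacent pair in this step.
            unsigned offset = 0;
            while ((mask & (1u << offset)) == 0) {
                ++offset;
            }
            return i + offset;
        }
        i += 16; // advance by a whole vector
    }
#endif
    // Scalar tail (and the whole range when SSE2 is unavailable).
    for (; i + 1 < n; ++i) {
        if (data[i] == data[i + 1]) {
            return i;
        }
    }
    return n;
}
```

The alternatives mentioned above would either keep the previous vector in a register and blend it with the new one to form the shifted operand, or step by fewer than 16 elements; this sketch simply pays for the second unaligned load instead.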

I calculated Speedup as Before divided by After, using spreadsheet software.

| Benchmark | Before | After | Speedup |
|---|---|---|---|
| `bm<AlgType::Std, char>/2525/1142` | 317 ns | 17.2 ns | 18.43 |
| `bm<AlgType::Std, short>/2525/1142` | 295 ns | 49.6 ns | 5.95 |
| `bm<AlgType::Std, int>/2525/1142` | 285 ns | 88.2 ns | 3.23 |
| `bm<AlgType::Std, long long>/2525/1142` | 284 ns | 161 ns | 1.76 |
| `bm<AlgType::Rng, char>/2525/1142` | 282 ns | 20.4 ns | 13.82 |
| `bm<AlgType::Rng, short>/2525/1142` | 283 ns | 47.1 ns | 6.01 |
| `bm<AlgType::Rng, int>/2525/1142` | 280 ns | 82.3 ns | 3.40 |
| `bm<AlgType::Rng, long long>/2525/1142` | 289 ns | 142 ns | 2.04 |