Vectorize `adjacent_find` by AlexGuteniev · Pull Request #5331 · microsoft/STL

The approach is similar to the `find` vectorization, except for the shifted load, which is similar to `unique` in #5092. Nothing novel 🥱

This time I haven't tried to avoid the double load. I trust past experience that it is faster than blending with the previous value, or than stepping by less than a whole vector.
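To illustrate the shifted double load described above, here is a minimal sketch for byte elements. It is not the code from this PR: it assumes plain SSE2, and `adjacent_find_u8` and `lowest_set_bit` are hypothetical names chosen for the example. Each step loads the same 16-byte window twice, once at offset `i` and once at offset `i + 1`, compares the two vectors lane by lane, and advances by a whole vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <emmintrin.h> // SSE2 intrinsics (x86/x64 only)

// Hypothetical helper: index of the lowest set bit. `mask` must be nonzero.
// A portable stand-in for a count-trailing-zeros instruction.
inline unsigned lowest_set_bit(unsigned mask) {
    unsigned index = 0;
    while ((mask & 1u) == 0) {
        mask >>= 1;
        ++index;
    }
    return index;
}

// Sketch (assumed name and signature): returns the index of the first i
// with data[i] == data[i + 1], or n if there is no such pair.
inline std::size_t adjacent_find_u8(const std::uint8_t* data, std::size_t n) {
    std::size_t i = 0;

    // Each step reads elements [i, i + 16], so it needs 17 readable bytes.
    while (n - i >= 17) {
        // Double load: the second load is the first one shifted by one element.
        const __m128i cur =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(data + i));
        const __m128i next =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(data + i + 1));

        // Bit k of the mask is set when data[i + k] == data[i + k + 1].
        const unsigned mask = static_cast<unsigned>(
            _mm_movemask_epi8(_mm_cmpeq_epi8(cur, next)));
        if (mask != 0) {
            return i + lowest_set_bit(mask);
        }

        i += 16; // the step is the whole vector
    }

    // Scalar tail for the remaining elements.
    for (; i + 1 < n; ++i) {
        if (data[i] == data[i + 1]) {
            return i;
        }
    }
    return n;
}
```

The second load overlaps the first in 15 of its 16 bytes, which is the redundancy the paragraph above chooses not to avoid. The alternatives it mentions would build the shifted vector from the previous iteration's data, or advance by less than a full vector.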

I calculated Speedup as Before divided by After in a spreadsheet.

| Benchmark | Before | After | Speedup |
| --- | --- | --- | --- |
| `bm<AlgType::Std, char>/2525/1142` | 317 ns | 17.2 ns | 18.43 |
| `bm<AlgType::Std, short>/2525/1142` | 295 ns | 49.6 ns | 5.95 |
| `bm<AlgType::Std, int>/2525/1142` | 285 ns | 88.2 ns | 3.23 |
| `bm<AlgType::Std, long long>/2525/1142` | 284 ns | 161 ns | 1.76 |
| `bm<AlgType::Rng, char>/2525/1142` | 282 ns | 20.4 ns | 13.82 |
| `bm<AlgType::Rng, short>/2525/1142` | 283 ns | 47.1 ns | 6.01 |
| `bm<AlgType::Rng, int>/2525/1142` | 280 ns | 82.3 ns | 3.40 |
| `bm<AlgType::Rng, long long>/2525/1142` | 289 ns | 142 ns | 2.04 |
