Vectorize basic_string::find by AlexGuteniev · Pull Request #5101 · microsoft/STL

5950X benchmarks look good:

| Benchmark | Before | After | Speedup |
| --- | --- | --- | --- |
| `bm<char, not_highly_aligned_allocator, Op::StringFind>/8021/3056` | 150 ns | 49.3 ns | 3.04 |
| `bm<char, not_highly_aligned_allocator, Op::StringFind>/63/62` | 14.9 ns | 4.83 ns | 3.08 |
| `bm<char, not_highly_aligned_allocator, Op::StringFind>/31/30` | 16.6 ns | 8.50 ns | 1.95 |
| `bm<char, not_highly_aligned_allocator, Op::StringFind>/15/14` | 9.86 ns | 7.93 ns | 1.24 |
| `bm<char, not_highly_aligned_allocator, Op::StringFind>/7/6` | 5.92 ns | 5.61 ns | 1.06 |
| `bm<char, highly_aligned_allocator, Op::StringFind>/8021/3056` | 149 ns | 49.9 ns | 2.99 |
| `bm<char, highly_aligned_allocator, Op::StringFind>/63/62` | 17.0 ns | 4.88 ns | 3.48 |
| `bm<char, highly_aligned_allocator, Op::StringFind>/31/30` | 16.5 ns | 8.04 ns | 2.05 |
| `bm<char, highly_aligned_allocator, Op::StringFind>/15/14` | 9.84 ns | 7.75 ns | 1.27 |
| `bm<char, highly_aligned_allocator, Op::StringFind>/7/6` | 5.93 ns | 5.42 ns | 1.09 |
| `bm<wchar_t, not_highly_aligned_allocator, Op::StringFind>/8021/3056` | 656 ns | 87.6 ns | 7.49 |
| `bm<wchar_t, not_highly_aligned_allocator, Op::StringFind>/63/62` | 16.1 ns | 4.70 ns | 3.43 |
| `bm<wchar_t, not_highly_aligned_allocator, Op::StringFind>/31/30` | 9.77 ns | 4.26 ns | 2.29 |
| `bm<wchar_t, not_highly_aligned_allocator, Op::StringFind>/15/14` | 5.78 ns | 5.81 ns | 0.99 |
| `bm<wchar_t, not_highly_aligned_allocator, Op::StringFind>/7/6` | 3.36 ns | 5.37 ns | 0.63 |
| `bm<char32_t, not_highly_aligned_allocator, Op::StringFind>/8021/3056` | 1289 ns | 165 ns | 7.81 |
| `bm<char32_t, not_highly_aligned_allocator, Op::StringFind>/63/62` | 32.9 ns | 5.33 ns | 6.17 |
| `bm<char32_t, not_highly_aligned_allocator, Op::StringFind>/31/30` | 14.9 ns | 4.53 ns | 3.29 |
| `bm<char32_t, not_highly_aligned_allocator, Op::StringFind>/15/14` | 8.18 ns | 4.05 ns | 2.02 |
| `bm<char32_t, not_highly_aligned_allocator, Op::StringFind>/7/6` | 4.22 ns | 4.68 ns | 0.90 |
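
For reading the numbers: the Speedup column appears to be Before divided by After, rounded to two decimals, so anything below 1.00 is a slowdown. A minimal sketch of that arithmetic follows; the helper names `speedup` and `round2` are hypothetical and not part of the benchmark code.

```cpp
#include <cmath>

// Hypothetical helper: ratio of the old time to the new time.
// A result above 1.0 means the "After" build is faster.
double speedup(double before_ns, double after_ns) {
    return before_ns / after_ns;
}

// Round to two decimals, the precision the table appears to use.
double round2(double x) {
    return std::round(x * 100.0) / 100.0;
}
```

For example, the `char` 8021/3056 row gives 150 / 49.3 ≈ 3.04, while the `wchar_t` 7/6 row gives 3.36 / 5.37 ≈ 0.63, a slowdown on a very small input.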