Benchmark count for vector<bool> by AlexGuteniev · Pull Request #5684 · microsoft/STL

There is an optimization of count for vector<bool> that uses popcnt on the integer elements of the vector<bool> internal representation, originally added in #1131. Open PR #5640 aims to enhance that optimization further.

This PR adds a benchmark to measure the results.

I started by copying vector_bool_copy.cpp to mimic the existing style, then kept only the _aligned case, since misalignment has no significant impact here (unlike for copying). I kept the count_aligned name, since plain count would match the STL algorithm name, and added DoNotOptimize where necessary. The value to count alternates to exercise both branches without adding extra benchmarks.
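The alternating-value idea can be sketched without the benchmark library as follows. This is not the benchmark added by the PR: run_count_rounds and run_result are hypothetical names, the volatile sink is only a crude stand-in for DoNotOptimize, and flipping the value once per round is an assumption about how the alternation is done.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical result type: total matches over all rounds plus elapsed time.
struct run_result {
    std::ptrdiff_t total;
    std::chrono::nanoseconds elapsed;
};

// Runs std::count over `bits` for `iterations` rounds, flipping the counted
// value each round so that both the true and the false path are exercised.
run_result run_count_rounds(const std::vector<bool>& bits, int iterations) {
    assert(iterations >= 0);

    // Writing each result to a volatile keeps the compiler from discarding
    // the call; a real benchmark would use the library's DoNotOptimize.
    volatile std::ptrdiff_t sink = 0;

    std::ptrdiff_t total = 0;
    bool value = false;

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        const auto n = static_cast<std::ptrdiff_t>(std::count(bits.begin(), bits.end(), value));
        sink = n;
        total += n;
        value = !value; // alternate between counting false and counting true
    }
    const auto stop = std::chrono::steady_clock::now();

    return {total, std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}
```

Dividing elapsed by the number of rounds gives an average that mixes both values to count, which is the trade-off of covering both branches in a single benchmark.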


The results for #5640 are mixed for me.

On the P cores of an i5-1235U I see no improvement:

| Benchmark | Before | After | Speedup |
|---|---|---|---|
| count_aligned/64 | 17.0 ns | 17.0 ns | 1.00 |
| count_aligned/4096 | 61.5 ns | 59.6 ns | 1.03 |
| count_aligned/65536 | 718 ns | 747 ns | 0.96 |

On the E cores I see some improvement, which is not bad for such a small change:

| Benchmark | Before | After | Speedup |
|---|---|---|---|
| count_aligned/64 | 21.3 ns | 21.6 ns | 0.99 |
| count_aligned/4096 | 114 ns | 90.6 ns | 1.26 |
| count_aligned/65536 | 1505 ns | 1092 ns | 1.38 |