Collect<Vec> from range doesn't optimize well. · Issue #43124 · rust-lang/rust
(At least on x86-64 nightly)
Using code like this:
https://is.gd/nkoecB
The version using collect is significantly slower than creating a Vec of zeros and then setting the values manually:
test using_collect ... bench: 117,777 ns/iter (+/- 6,424)
test using_manual ... bench: 7,677 ns/iter (+/- 365)
test using_unsafe ... bench: 3,866 ns/iter (+/- 394)
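The benchmarked code lives behind the link above and is not reproduced in this report, so the following is only a rough sketch of what the three variants might look like. The function names are taken from the benchmark output. The element type alias `Elem`, the length `N`, and the function bodies are assumptions made for illustration, not the original code.

```rust
// Hypothetical reconstruction: the linked playground code is not shown here.
type Elem = u32; // assumption: swap this alias to try other integer widths
const N: Elem = 10_000; // assumption: arbitrary vector length

/// Build the vector by collecting a range.
fn using_collect() -> Vec<Elem> {
    (0..N).collect()
}

/// Allocate a zero-filled vector, then overwrite every slot.
fn using_manual() -> Vec<Elem> {
    let mut v = vec![0 as Elem; N as usize];
    for (i, slot) in v.iter_mut().enumerate() {
        *slot = i as Elem;
    }
    v
}

/// Reserve capacity, write through a raw pointer, then set the length.
fn using_unsafe() -> Vec<Elem> {
    let len = N as usize;
    let mut v: Vec<Elem> = Vec::with_capacity(len);
    let p = v.as_mut_ptr();
    unsafe {
        for i in 0..len {
            // SAFETY: i < len <= capacity, so the write stays in the allocation.
            p.add(i).write(i as Elem);
        }
        // SAFETY: all `len` elements were initialized by the loop above.
        v.set_len(len);
    }
    v
}

fn main() {
    // All three variants should produce the same contents.
    let a = using_collect();
    assert_eq!(a, using_manual());
    assert_eq!(a, using_unsafe());
    println!("all three variants agree on {} elements", a.len());
}
```

To time these on nightly, each function could be wrapped in a `#[bench]` function that takes a `test::Bencher` (with `#![feature(test)]` and `extern crate test;` at the crate root) and run with `cargo bench`.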
On the other hand, with the same code but u32 as the element type, collect does much better:
test using_collect ... bench: 7,677 ns/iter (+/- 555)
test using_manual ... bench: 12,487 ns/iter (+/- 836)
test using_unsafe ... bench: 7,741 ns/iter (+/- 413)
Same with u64:
test using_collect ... bench: 18,675 ns/iter (+/- 1,335)
test using_manual ... bench: 29,692 ns/iter (+/- 1,864)
test using_unsafe ... bench: 18,559 ns/iter (+/- 1,065)
I suspect this may be SIMD-related. Will see if there are similar results on stable.