Collect<Vec> from range doesn't optimize well. · Issue #43124 · rust-lang/rust

(At least on x86-64 nightly)

Using code like this:
https://is.gd/nkoecB

The version using collect is significantly slower (roughly 15x in the run below) than creating a Vec of zeros and then setting the values manually.

test using_collect ... bench:     117,777 ns/iter (+/- 6,424)
test using_manual  ... bench:       7,677 ns/iter (+/- 365)
test using_unsafe  ... bench:       3,866 ns/iter (+/- 394)
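
For readers who cannot open the link, the three variants named in the benchmark output might look roughly like the sketch below. This is a guess at their shape, not the linked code: the element type is not stated in the text above, so `u16` is used purely as an assumption, and the `n` parameter is hypothetical.

```rust
// Hypothetical reconstruction of the three benchmarked variants.
// ASSUMPTION: u16 elements; the issue text does not name the slow type.

/// Build the vector by collecting a range directly.
fn using_collect(n: u16) -> Vec<u16> {
    (0..n).collect()
}

/// Allocate a zero-filled vector, then overwrite each slot.
fn using_manual(n: u16) -> Vec<u16> {
    let mut v = vec![0u16; n as usize];
    for (i, slot) in v.iter_mut().enumerate() {
        // i < n <= u16::MAX, so the cast cannot truncate.
        *slot = i as u16;
    }
    v
}

/// Reserve capacity, write through a raw pointer, then set the length.
fn using_unsafe(n: u16) -> Vec<u16> {
    let len = n as usize;
    let mut v: Vec<u16> = Vec::with_capacity(len);
    let p = v.as_mut_ptr();
    for i in 0..len {
        // SAFETY: i < len <= capacity, so the write stays inside the allocation.
        unsafe { p.add(i).write(i as u16) };
    }
    // SAFETY: every one of the first `len` elements was initialized above.
    unsafe { v.set_len(len) };
    v
}
```

All three functions should return the same vector for a given `n`, so any timing difference between them comes from how well each form is optimized rather than from the work requested.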

On the other hand, with the same code but using u32 instead, collect is much better:

test using_collect ... bench:       7,677 ns/iter (+/- 555)
test using_manual  ... bench:      12,487 ns/iter (+/- 836)
test using_unsafe  ... bench:       7,741 ns/iter (+/- 413)

Same with u64:

test using_collect ... bench:      18,675 ns/iter (+/- 1,335)
test using_manual  ... bench:      29,692 ns/iter (+/- 1,864)
test using_unsafe  ... bench:      18,559 ns/iter (+/- 1,065)

I suspect this may be SIMD-related. Will see if there are similar results on stable.