Fastest way to initialize a vector is not documented

There are at least three distinct ways to create a zero-filled vector of a given length:

    use std::iter::repeat;

    // resize
    let mut vec1 = Vec::with_capacity(len);
    vec1.resize(len, 0);

    // extend
    let mut vec2 = Vec::with_capacity(len);
    vec2.extend(repeat(0).take(len));

    // vec! macro
    let mut vec3 = vec![0; len];

Although the last one is the most concise, the other two also show up in real-world code. The performance characteristics of these approaches are not obvious, and they are not documented, at least not on the Vec page in the stdlib reference.
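To get a rough feel for the difference, the three approaches above can be wrapped in functions and timed. This is only a minimal sketch, not a rigorous benchmark: the function names, the `u8` element type, and the 16 MiB size are assumptions made for illustration, and the numbers will vary by platform, allocator, and optimization level.

```rust
use std::hint::black_box;
use std::iter::repeat;
use std::time::Instant;

// Hypothetical wrappers around the three approaches from the issue.
fn with_resize(len: usize) -> Vec<u8> {
    let mut v: Vec<u8> = Vec::with_capacity(len);
    v.resize(len, 0);
    v
}

fn with_extend(len: usize) -> Vec<u8> {
    let mut v: Vec<u8> = Vec::with_capacity(len);
    v.extend(repeat(0u8).take(len));
    v
}

fn with_macro(len: usize) -> Vec<u8> {
    vec![0; len]
}

/// Time a single call of `f` and print the elapsed wall-clock time.
/// `black_box` discourages the optimizer from eliding the allocation.
fn time_it(name: &str, f: fn(usize) -> Vec<u8>, len: usize) {
    let start = Instant::now();
    let v = black_box(f(black_box(len)));
    println!("{}: {:?} for {} bytes", name, start.elapsed(), v.len());
}

fn main() {
    let len = 16 * 1024 * 1024; // arbitrary size chosen for illustration
    time_it("resize", with_resize, len);
    time_it("extend", with_extend, len);
    time_it("vec!", with_macro, len);
}
```

Build with optimizations (for example `cargo run --release`) before comparing timings. A single timed call is noisy, so a proper benchmark harness would be needed to draw firm conclusions.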

This has led to an actual security vulnerability in the Claxon crate, now known as RUSTSEC-2018-0004. On some malformed inputs, the contents of uninitialized memory would show up in the output. See the original bug report or the security advisory for more details.

Like most binary format decoders, Claxon writes into a preallocated buffer. The memory unsafety that led to the vulnerability was introduced to speed up initialization of the vector. Initialization was originally performed with buffer.extend(repeat(0).take(new_len - len)), but this was replaced with unsafe { buffer.set_len(new_len); } for performance (see relevant commit). Unlike extend, set_len does not write to the new elements, so any part of the buffer the decoder failed to overwrite still held whatever was in that memory before.

Anything that desugars into RawVec::with_capacity_zeroed() is dramatically more efficient than .extend(), at least on Linux. One public wrapper for this function is the vec! macro. Replacing Vec::with_capacity(new_len); unsafe { buffer.set_len(new_len); } with vec![0; new_len]; made no measurable performance difference in Claxon, eliminating the need for unsafe code.
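The safe replacement can be sketched as follows. This is not Claxon's actual code: `zeroed_buffer` and `decode_into` are hypothetical names, and `decode_into` merely stands in for a decoder that, on malformed input, writes fewer samples than the buffer holds. With a zero-filled buffer, the unwritten tail contains zeros rather than leftover memory contents.

```rust
/// Hypothetical helper: allocate a zero-filled buffer with the vec! macro,
/// with no unsafe code and no set_len call.
fn zeroed_buffer(new_len: usize) -> Vec<i32> {
    vec![0; new_len]
}

/// Hypothetical stand-in for a decoder that may write fewer samples than
/// the buffer can hold. Returns the number of samples written.
fn decode_into(buffer: &mut [i32], samples: &[i32]) -> usize {
    let n = samples.len().min(buffer.len());
    buffer[..n].copy_from_slice(&samples[..n]);
    n
}

fn main() {
    let mut buffer = zeroed_buffer(6);
    let written = decode_into(&mut buffer, &[1, 2, 3]);
    assert_eq!(written, 3);
    // The part the decoder never touched is still zero.
    assert_eq!(buffer, [1, 2, 3, 0, 0, 0]);
    println!("{:?}", buffer);
}
```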

I believe this vulnerability would have been prevented if the documentation clearly stated that the vec! macro has special behavior when given 0 as the element value and is dramatically more efficient than other means of initialization (at least on some platforms).