perf: remove loop from str::floor_char_boundary by overlookmotel · Pull Request #149466 · rust-lang/rust
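#144472 made `str::floor_char_boundary` a `const` function, but in doing so introduced a loop. The loop is unnecessary: a UTF-8 character is at most 4 bytes long, so the nearest character boundary at or before `index` can only be within a 4-byte range (`index` itself or one of the 3 bytes before it).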
This produces excessive code for e.g. str.floor_char_boundary(20), because the loop is unrolled.
https://godbolt.org/z/5f3YsM6oK
This PR replaces the loop with 3 checks in series.
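To illustrate the idea, here is a minimal stand-alone sketch of a loop-free search. It is not the code from this PR: the names `is_boundary_byte` and `floor_char_boundary_sketch` are hypothetical, and it uses ordinary indexing rather than the unchecked operations discussed in the notes below. It relies only on the fact that a valid `str` never starts with a continuation byte and never has more than 3 continuation bytes in a row.

```rust
/// Returns true if `b` can start a UTF-8 character, i.e. it is not a
/// continuation byte (bit pattern 0b10xx_xxxx).
fn is_boundary_byte(b: u8) -> bool {
    // Continuation bytes are 0x80..=0xBF, which is -128..=-65 as i8.
    (b as i8) >= -0x40
}

/// Hypothetical illustration: finds the closest char boundary at or
/// before `index` using at most 3 checks in series, with no loop.
fn floor_char_boundary_sketch(s: &str, index: usize) -> usize {
    let bytes = s.as_bytes();
    if index >= bytes.len() {
        // The end of the string is always a boundary.
        return bytes.len();
    }
    if is_boundary_byte(bytes[index]) {
        return index;
    }
    // bytes[index] is a continuation byte, so a lead byte precedes it
    // and index >= 1 in any valid str.
    if is_boundary_byte(bytes[index - 1]) {
        return index - 1;
    }
    // Two continuation bytes in a row: the character has at least 3 bytes.
    if is_boundary_byte(bytes[index - 2]) {
        return index - 2;
    }
    // Three continuation bytes in a row can only be the tail of a 4-byte
    // character, so its lead byte must be 3 bytes back.
    index - 3
}
```

For example, in `"a€😀"` the `€` occupies bytes 1 to 3, so `floor_char_boundary_sketch("a€😀", 2)` returns 1 after two checks.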
In addition to reducing code size in some cases, it also removes bounds checks from calling code when following floor_char_boundary with a call to String::truncate (which I assume might be a common pattern).
Notes
- I tried using `index.unchecked_sub(1)`, but found it doesn't remove bounds checks, whereas `index.checked_sub(1).unwrap_unchecked()` does. Surprising!
- The `assert_unchecked`s are required to elide checks from code following the `floor_char_boundary` call e.g.:

      let index = string.floor_char_boundary(20);
      string.truncate(index);
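As a rough sketch of how such hints work, the hypothetical helper below states its postconditions with `std::hint::assert_unchecked` so that a caller such as `String::truncate` may be able to skip re-checking them. The names `floor_with_hint` and `truncate_to_floor` are made up for illustration, the exact conditions asserted in this PR may differ, and the search is a plain loop because the sketch is only about the hints. Whether the checks actually disappear depends on the compiler version and on inlining.

```rust
use std::hint::assert_unchecked;

/// Hypothetical helper: returns the closest char boundary at or before
/// `index`, and tells the optimizer what is known about the result.
fn floor_with_hint(s: &str, index: usize) -> usize {
    let floor = if index >= s.len() {
        s.len()
    } else {
        let mut i = index;
        // Index 0 is always a boundary, so this cannot underflow.
        while !s.is_char_boundary(i) {
            i -= 1;
        }
        i
    };
    // SAFETY: `floor` is at most `s.len()` and lies on a char boundary,
    // by construction above.
    unsafe {
        assert_unchecked(floor <= s.len());
        assert_unchecked(s.is_char_boundary(floor));
    }
    floor
}

/// The calling pattern from the description: floor, then truncate.
fn truncate_to_floor(string: &mut String, index: usize) {
    let floor = floor_with_hint(string, index);
    // `truncate` panics if `floor` is not on a char boundary; the hints
    // above are intended to let the compiler prove that cannot happen.
    string.truncate(floor);
}
```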
If this PR is accepted, I'll follow up with a similar PR for ceil_char_boundary.
Very happy to receive feedback and amend.