Optimize `Wtf8Buf::into_string` for the case where it contains UTF-8. by sunfishcode · Pull Request #96869 · rust-lang/rust
Add an `is_known_utf8` flag to `Wtf8Buf`, which tracks whether the
string is known to contain UTF-8. This is efficiently computed in many
common situations, such as when a `Wtf8Buf` is constructed from a `String`
or `&str`, or with `Wtf8Buf::from_wide`, which is already doing UTF-16
decoding and already checking for surrogates.
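As a rough illustration of the idea, here is a minimal sketch of a buffer that tracks such a flag. `SketchWtf8Buf` and its methods are hypothetical names invented for this example; this is not the actual `Wtf8Buf` code in std, only a simplified model of the technique described above, built on the public `char::decode_utf16` API.

```rust
/// Hypothetical, simplified stand-in for std's internal `Wtf8Buf`.
pub struct SketchWtf8Buf {
    bytes: Vec<u8>,
    /// `true` only when `bytes` is known to be valid UTF-8.
    /// `false` means "unknown", so a scan is still required.
    is_known_utf8: bool,
}

impl SketchWtf8Buf {
    /// A `String` is always valid UTF-8, so the flag is set for free.
    pub fn from_string(s: String) -> Self {
        Self { bytes: s.into_bytes(), is_known_utf8: true }
    }

    /// Decode potentially ill-formed UTF-16. The decoder already detects
    /// unpaired surrogates, so the flag costs no extra pass.
    pub fn from_wide(units: &[u16]) -> Self {
        let mut buf = Self {
            bytes: Vec::with_capacity(units.len()),
            is_known_utf8: true,
        };
        for item in char::decode_utf16(units.iter().copied()) {
            match item {
                Ok(c) => {
                    let mut tmp = [0u8; 4];
                    buf.bytes.extend_from_slice(c.encode_utf8(&mut tmp).as_bytes());
                }
                Err(e) => {
                    // Lone surrogate: store it as a 3-byte WTF-8 sequence
                    // and record that the buffer is no longer UTF-8.
                    let s = e.unpaired_surrogate();
                    buf.bytes.extend_from_slice(&[
                        0xE0 | (s >> 12) as u8,
                        0x80 | ((s >> 6) & 0x3F) as u8,
                        0x80 | (s & 0x3F) as u8,
                    ]);
                    buf.is_known_utf8 = false;
                }
            }
        }
        buf
    }

    /// O(1) when the flag is set; otherwise falls back to an O(N) scan.
    pub fn into_string(self) -> Result<String, Self> {
        if self.is_known_utf8 {
            // SAFETY: the flag is only set when every byte pushed so far
            // came from a `String` or from a successfully decoded `char`.
            return Ok(unsafe { String::from_utf8_unchecked(self.bytes) });
        }
        // Encoded surrogates are invalid UTF-8, so validation rejects them.
        match String::from_utf8(self.bytes) {
            Ok(s) => Ok(s),
            Err(e) => Err(Self { bytes: e.into_bytes(), is_known_utf8: false }),
        }
    }
}
```

In this sketch the flag is conservative: once it is cleared it is never set again, and a cleared flag only forces the slow path rather than implying that the contents are invalid.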
This makes `OsString::into_string` O(1) rather than O(N) on Windows in
common cases.
And, it eliminates the need to scan through the string for surrogates in
`Args::next` and `Vars::next`, because the strings are already being
translated with `Wtf8Buf::from_wide`.
Many things on Windows construct `OsString`s with `Wtf8Buf::from_wide`,
such as `DirEntry::file_name` and `fs::read_link`, so with this patch,
users of those functions can subsequently call `.into_string()` without
paying for an extra scan through the string for surrogates.
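For context, the kind of caller that benefits might look like the sketch below. `utf8_file_names` is a hypothetical helper written for this example, not something added by the patch; it uses only the public `fs::read_dir`, `DirEntry::file_name`, and `OsString::into_string` APIs.

```rust
use std::ffi::OsString;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: split a directory's entry names into those that
/// convert to `String` and those that do not (for example, names that
/// contain unpaired surrogates on Windows).
pub fn utf8_file_names(dir: &Path) -> io::Result<(Vec<String>, Vec<OsString>)> {
    let mut utf8 = Vec::new();
    let mut other = Vec::new();
    for entry in fs::read_dir(dir)? {
        // `file_name` returns an `OsString`; `into_string` hands back the
        // original `OsString` in `Err` when it is not valid Unicode.
        match entry?.file_name().into_string() {
            Ok(name) => utf8.push(name),
            Err(raw) => other.push(raw),
        }
    }
    Ok((utf8, other))
}
```

The calling code is the same with or without the patch; only the cost of each `into_string` call on Windows changes.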
r? @ghost