Clarify str::from_utf8_unchecked's invariants by CAD97 · Pull Request #95895 · rust-lang/rust

This is true, but it is perhaps already handled if the validity of &T is independent of whether the pointee bytes are valid at type T.

In general, from_unchecked functions simply state that it is UB to provide an argument that does not satisfy the safety invariant. This is useful because it permits a sanitizing implementation of the function that checks the precondition.
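To illustrate what a sanitizing implementation could look like, here is a minimal sketch. The name from_utf8_unchecked_sanitized is hypothetical and not part of std; it only assumes the existing std::str::from_utf8 and std::str::from_utf8_unchecked functions. Because the precondition is a property of the argument alone, it can be checked at the call boundary:

```rust
/// Hypothetical sanitizing wrapper (not a std API): same contract as
/// `std::str::from_utf8_unchecked`, but the precondition is verified
/// in builds with debug assertions enabled.
///
/// # Safety
///
/// `bytes` must be valid UTF-8.
pub unsafe fn from_utf8_unchecked_sanitized(bytes: &[u8]) -> &str {
    // The precondition depends only on the argument, so it can be
    // checked here without knowing how the result is used later.
    debug_assert!(
        std::str::from_utf8(bytes).is_ok(),
        "from_utf8_unchecked_sanitized called with invalid UTF-8"
    );
    // SAFETY: the caller promises that `bytes` is valid UTF-8.
    unsafe { std::str::from_utf8_unchecked(bytes) }
}
```

A check of this kind would not be expressible if the contract were instead phrased in terms of how the returned &str is used afterwards.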

A postcondition of "is not used in any way which causes UB" is much more difficult to reason about.

Another point is that str has the option of using &*(v as *const [u8] as *const str) to construct a &str that points to invalid UTF-8. String doesn't have any such API; it relies on conversion to/from Vec<u8>.
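As a rough sketch of the two routes mentioned here, the following uses the pointer cast for str and the Vec<u8> round trip for String. The helper names str_via_pointer_cast and string_round_trip are hypothetical, and the example deliberately sticks to valid UTF-8 so that it does not depend on how the invalid case is ultimately specified:

```rust
use std::string::FromUtf8Error;

/// Hypothetical helper: reinterpret a byte slice as `str` via a raw
/// pointer cast, bypassing `from_utf8_unchecked` entirely.
///
/// # Safety
///
/// This sketch requires `v` to be valid UTF-8.
pub unsafe fn str_via_pointer_cast(v: &[u8]) -> &str {
    // SAFETY: `str` has the same layout as `[u8]`, and the caller
    // promises that the bytes are valid UTF-8.
    unsafe { &*(v as *const [u8] as *const str) }
}

/// Hypothetical helper: `String` has no comparable cast, so a round
/// trip goes through `Vec<u8>` and, in safe code, is validated on the
/// way back.
pub fn string_round_trip(s: String) -> Result<String, FromUtf8Error> {
    let bytes: Vec<u8> = s.into_bytes();
    String::from_utf8(bytes)
}
```

The safe String::from_utf8 path rejects invalid bytes with an error; only the unsafe String::from_utf8_unchecked skips that validation.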

If any str/String methods actually documented that they were safe to call with a reference to invalid UTF-8, then this weaker documentation requirement would make sense. As it is, the only possible thing to do with a str/String containing invalid UTF-8 is to forget it. That being the case, the clearer precondition seems better to use.