BStr and BString transparent wrapper types for probably-human-readable text · Issue #502 · rust-lang/libs-team

Proposal

Problem statement

Support users working with byte vectors/slices that are usually human-readable and usually UTF-8, but aren't guaranteed to be. Provide a common set of vocabulary types that users can use, without requiring them to go to the crates ecosystem.

Motivating examples or use cases

If you work with data that is usually but not always human-readable, and you store it in a Vec<u8> or &[u8], printing it yields long lists of numbers such as [97, 98, 99, ...] rather than "abc...". A Vec<u8> also doesn't tell the type system that you generally expect human-readable text.
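The difference in output can be seen with a small sketch. The helper names `debug_bytes` and `lossy_text` are made up for this illustration and are not part of the proposal; the lossy conversion is only one way to get readable output from bytes that may not be valid UTF-8.

```rust
// Hypothetical helpers for illustration only; not part of the proposed API.

/// Formats a byte slice with the default `Debug` impl for `[u8]`.
fn debug_bytes(bytes: &[u8]) -> String {
    format!("{:?}", bytes)
}

/// Renders bytes as text, replacing invalid UTF-8 with U+FFFD.
fn lossy_text(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    let data: &[u8] = b"abc";
    println!("{}", debug_bytes(data)); // prints [97, 98, 99]
    println!("{}", lossy_text(data)); // prints abc

    // One invalid byte in otherwise readable data.
    let mixed: &[u8] = b"ab\xFFc";
    println!("{}", lossy_text(mixed)); // prints ab, then U+FFFD, then c
}
```

Without a dedicated type, every call site has to remember to perform a conversion like this before printing.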

Having separate types allows having different trait impls, for purposes such as Debug, serialize/deserialize traits, database query traits, etc.

The extremely popular bstr crate, with numerous downloads and numerous crates depending on it, provides evidence of the desirability of these types.

Embedding a BStr or BString in a data structure provides type-level information that the user expects it to be human readable rather than arbitrary bytes.

A few of the many potential use cases:

Solution sketch

// In core::bstr
#[repr(transparent)]
pub struct BStr(pub [u8]);

impl Debug for BStr { ... }
impl Display for BStr { ... }
impl Deref for BStr { type Target = [u8]; ... }
impl DerefMut for BStr { ... }
// Other trait impls from bstr, including From impls

// In alloc::bstr
#[repr(transparent)]
pub struct BString(pub Vec<u8>);

impl Debug for BString { ... }
impl Display for BString { ... }
impl Deref for BString { type Target = Vec<u8>; ... }
impl DerefMut for BString { ... }
// Other trait impls from bstr, including From impls

This ACP is not proposing to add any inherent methods. The bstr types have no inherent methods except new, and this ACP is proposing to make the types transparent wrappers instead.

#499 provides the relevant methods, and those methods will be accessible via Deref/DerefMut. This is similar to the approach bstr uses.
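To make the shape of the owned type concrete, here is a minimal local sketch of such a wrapper. It is not the proposed standard library implementation and does not reproduce the bstr crate's formatting. The escaping rules in `Debug` (valid UTF-8 via `str::escape_debug`, invalid bytes as `\xNN`) and the lossy `Display` are assumptions chosen for illustration.

```rust
use std::fmt;
use std::ops::{Deref, DerefMut};

// Local illustrative type; it only mirrors the shape given in the proposal.
#[repr(transparent)]
pub struct BString(pub Vec<u8>);

impl Deref for BString {
    type Target = Vec<u8>;
    fn deref(&self) -> &Vec<u8> {
        &self.0
    }
}

impl DerefMut for BString {
    fn deref_mut(&mut self) -> &mut Vec<u8> {
        &mut self.0
    }
}

impl fmt::Display for BString {
    // Assumption: invalid UTF-8 is shown as U+FFFD.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&String::from_utf8_lossy(&self.0))
    }
}

impl fmt::Debug for BString {
    // Assumption: quoted output, with invalid bytes escaped as \xNN.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("\"")?;
        let mut rest: &[u8] = &self.0;
        while !rest.is_empty() {
            match std::str::from_utf8(rest) {
                Ok(s) => {
                    write!(f, "{}", s.escape_debug())?;
                    break;
                }
                Err(e) => {
                    // Emit the valid prefix, then escape the bad bytes.
                    let (valid, after) = rest.split_at(e.valid_up_to());
                    let s = std::str::from_utf8(valid).expect("prefix is valid UTF-8");
                    write!(f, "{}", s.escape_debug())?;
                    let bad_len = e.error_len().unwrap_or(after.len());
                    for b in &after[..bad_len] {
                        write!(f, "\\x{:02X}", b)?;
                    }
                    rest = &after[bad_len..];
                }
            }
        }
        f.write_str("\"")
    }
}

fn main() {
    let mut s = BString(b"ab\xFFc".to_vec());
    println!("{:?}", s); // prints "ab\xFFc" (including the quotes)
    s.push(b'!'); // Vec<u8> method reached through DerefMut
    println!("{}", s.len()); // prints 5
}
```

Because the wrapper derefs to `Vec<u8>`, existing vector and slice methods remain available without any inherent methods on the wrapper itself.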

Alternatives

The primary alternative would be to continue using a crate from crates.io. I would propose that these types are simple enough and fundamental enough that they should be in the standard library as "vocabulary types".

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

Second, if there's a concrete solution: