BStr and BString transparent wrapper types for probably-human-readable text · Issue #502 · rust-lang/libs-team
Proposal
Problem statement
Support users working with byte vectors/slices that are usually human-readable and usually UTF-8, but aren't guaranteed to be. Provide a common set of vocabulary types that users can use, without requiring them to go to the crates ecosystem.
Motivating examples or use cases
If you need to work with "usually but not always human-readable data", and you use Vec<u8> or &[u8], and you print those, you get very long strings like [97, 98, 99, ..., rather than "abc...". A Vec<u8> doesn't tell the type system that you generally expect human-readable text.
Having separate types allows having different trait impls, for purposes such as Debug, serialize/deserialize traits, database query traits, etc.
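As a small illustration of the printing problem described above, the following sketch compares the default Debug output of a byte slice with the output obtained by decoding it lossily as text first. The helper names debug_as_bytes and debug_as_text are hypothetical and exist only for this example; the lossy decoding is one possible way to get readable output, not necessarily what the proposed types would do.

```rust
/// Formats bytes the way `Vec<u8>` and `&[u8]` do by default: as a list of integers.
/// (Hypothetical helper, for illustration only.)
fn debug_as_bytes(data: &[u8]) -> String {
    format!("{:?}", data)
}

/// Formats bytes as quoted text, replacing any invalid UTF-8 with U+FFFD.
/// (Hypothetical helper, for illustration only.)
fn debug_as_text(data: &[u8]) -> String {
    format!("{:?}", String::from_utf8_lossy(data))
}

fn main() {
    let data = b"abc".to_vec();
    println!("{}", debug_as_bytes(&data)); // prints [97, 98, 99]
    println!("{}", debug_as_text(&data)); // prints "abc"
}
```

A dedicated wrapper type lets the second behavior be the default for Debug, instead of requiring a conversion at every print site.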
The extremely popular bstr crate, with numerous downloads and numerous crates depending on it, provides evidence of the desirability of these types.
Embedding a BStr or BString in a data structure provides type-level information that the user expects it to be human-readable rather than arbitrary bytes.
A few of the many potential use cases:
- Handling strings from stdin/stdout of a (potentially remote) command, which are probably human-readable but may contain human-supplied non-UTF-8 bytes that should be emitted as-is.
- Implementations of tools like git, which need portable paths (not the paths of the local OS) that are always bytes.
- Implementations of text processing tools that assume the data is human-readable.
Solution sketch
// In core::bstr
#[repr(transparent)]
pub struct BStr(pub [u8]);

impl Debug for BStr { ... }
impl Display for BStr { ... }
impl Deref for BStr { type Target = [u8]; ... }
impl DerefMut for BStr { ... }
// Other trait impls from bstr, including From impls

// In alloc::bstr
#[repr(transparent)]
pub struct BString(pub Vec<u8>);

impl Debug for BString { ... }
impl Display for BString { ... }
impl Deref for BString { type Target = Vec<u8>; ... }
impl DerefMut for BString { ... }
// Other trait impls from bstr, including From impls
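To make the shape of the owned type more concrete, here is a minimal, self-contained sketch of a transparent wrapper with the listed trait impls. The name ByteString is hypothetical (chosen to avoid implying this is the actual standard library or bstr implementation), and the use of String::from_utf8_lossy for Debug and Display is an assumption made for brevity; the real types may format invalid bytes differently.

```rust
use std::fmt;
use std::ops::{Deref, DerefMut};

/// Hypothetical stand-in for the proposed `BString`: a transparent wrapper
/// around `Vec<u8>` whose formatting treats the bytes as text.
#[repr(transparent)]
pub struct ByteString(pub Vec<u8>);

impl fmt::Debug for ByteString {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Assumption: decode lossily, so invalid UTF-8 shows up as U+FFFD.
        fmt::Debug::fmt(&String::from_utf8_lossy(&self.0), f)
    }
}

impl fmt::Display for ByteString {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Same lossy-decoding assumption as in the Debug impl.
        f.write_str(&String::from_utf8_lossy(&self.0))
    }
}

impl Deref for ByteString {
    type Target = Vec<u8>;
    fn deref(&self) -> &Vec<u8> {
        &self.0
    }
}

impl DerefMut for ByteString {
    fn deref_mut(&mut self) -> &mut Vec<u8> {
        &mut self.0
    }
}

impl From<Vec<u8>> for ByteString {
    fn from(bytes: Vec<u8>) -> Self {
        ByteString(bytes)
    }
}

impl From<&str> for ByteString {
    fn from(text: &str) -> Self {
        ByteString(text.as_bytes().to_vec())
    }
}

fn main() {
    let s = ByteString::from("abc");
    println!("{s:?}"); // prints "abc" (with quotes)
    println!("{s}"); // prints abc
}
```

Because the field is public and the wrapper is transparent, callers can still reach the underlying Vec<u8> directly when they need the raw bytes.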
This ACP is not proposing to add any inherent methods. The bstr types have no inherent methods except new, and this ACP is proposing to make the types transparent wrappers instead. #499 provides the relevant methods, and those methods will be accessible via Deref/DerefMut. This is similar to the approach bstr uses.
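The following sketch shows how a wrapper with no inherent methods can still expose the full Vec<u8> and slice API through Deref and DerefMut. The type ByteString and the helper append_line are hypothetical names used only for this example, and the methods called are the existing standard library ones, not the ones proposed in #499.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical wrapper with no inherent methods of its own.
pub struct ByteString(pub Vec<u8>);

impl Deref for ByteString {
    type Target = Vec<u8>;
    fn deref(&self) -> &Vec<u8> {
        &self.0
    }
}

impl DerefMut for ByteString {
    fn deref_mut(&mut self) -> &mut Vec<u8> {
        &mut self.0
    }
}

/// Appends `line` plus a trailing newline to the buffer.
/// (Hypothetical helper, for illustration only.)
fn append_line(buf: &mut ByteString, line: &[u8]) {
    buf.extend_from_slice(line); // `Vec` method, reached via `DerefMut`
    buf.push(b'\n'); // likewise
}

fn main() {
    let mut buf = ByteString(Vec::new());
    append_line(&mut buf, b"hello");

    // Slice methods are reached via `Deref` to `Vec<u8>`, then to `[u8]`.
    assert!(buf.starts_with(b"hello"));
    assert_eq!(buf.len(), 6);
}
```

Any method later added to byte slices or Vec<u8> becomes available on such a wrapper automatically, which is why the wrapper itself needs no inherent methods.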
Alternatives
The primary alternative would be to continue using a crate from crates.io. I would propose that these types are simple enough and fundamental enough that they should be in the standard library as "vocabulary types".
What happens now?
This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.
Possible responses
The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):
- We think this problem seems worth solving, and the standard library might be the right place to solve it.
- We think that this probably doesn't belong in the standard library.
Second, if there's a concrete solution:
- We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
- We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.