ACP: Add FromByteStr trait with blanket impl FromStr · Issue #287 · rust-lang/libs-team
Proposal
Problem statement
Many data forms that can be parsed from a string representation do not need UTF-8. Here, FromStr is unnecessarily restrictive because a byte slice &[u8] cannot be parsed directly. Instead, a UTF-8 &str must be constructed to use .parse().
This is inconvenient when working with raw buffers, where one cannot assume that str::from_utf8 will succeed and where there is no reason to incur the UTF-8 verification overhead. An example is IP addresses, for which there is an unstable from_bytes function: rust-lang/rust#101035
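To make the workaround described above concrete, here is a minimal sketch of what callers have to do today on stable Rust. The helper parse_via_utf8 and the error type ParseBytesError are hypothetical names introduced only for this illustration; they are not part of the standard library or of this proposal.

```rust
use std::str::{self, FromStr};

/// Hypothetical error type: the two-step workaround can fail either at
/// UTF-8 validation or in the actual parser.
#[derive(Debug)]
pub enum ParseBytesError<E> {
    NotUtf8(str::Utf8Error),
    Parse(E),
}

/// Hypothetical helper: parse any `FromStr` type from raw bytes by first
/// validating the whole buffer as UTF-8, then calling `str::parse`.
pub fn parse_via_utf8<T: FromStr>(bytes: &[u8]) -> Result<T, ParseBytesError<T::Err>> {
    // Step 1: validation pass over the entire input, even though the
    // parser below only accepts a small ASCII subset anyway.
    let s = str::from_utf8(bytes).map_err(ParseBytesError::NotUtf8)?;
    // Step 2: the real parse.
    s.parse::<T>().map_err(ParseBytesError::Parse)
}
```

The extra validation pass and the second error case are the overhead that the proposal aims to remove for types whose textual form is plain ASCII.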
Motivating examples or use cases
Any input data where UTF-8 cannot be guaranteed: stdin, file paths, data from Read, network packets, no_std without UTF tables, any data read one byte at a time, etc.
Any output data that doesn't require specific knowledge of UTF-8: integers, floating point, IP/socket addresses, MAC addresses, UUIDs, etc.
Solution sketch
Add a trait that mirrors FromStr but works with &[u8] byte slices:
// Likely located in core::slice
pub trait FromByteStr: Sized {
    type Err;

    // Required method
    fn from_byte_str(bytes: &[u8]) -> Result<Self, Self::Err>;
}
This will get a corresponding parse method on &[u8]:
impl [u8] {
    pub fn parse<F>(&self) -> Result<F, <F as FromByteStr>::Err>
    where
        F: FromByteStr,
    {
        /* ... */
    }
}
Since &str is necessarily represented as &[u8], we can provide a blanket impl so no types need to be duplicated:
impl<T> FromStr for T
where
    T: FromByteStr,
{
    type Err = T::Err;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        s.as_bytes().parse()
    }
}
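The pieces above can be approximated on stable Rust today, which may help when evaluating the design. The following is a sketch under stated assumptions, not the proposed implementation: FromByteStr is redefined locally, the extension trait ParseBytes stands in for the inherent parse method on [u8] (only the standard library can add inherent methods there), and the free function parse_str stands in for the blanket impl, which coherence rules do not allow outside the crate that defines FromStr. MacAddr and ParseMacError are hypothetical example types.

```rust
/// Local stand-in for the proposed trait.
pub trait FromByteStr: Sized {
    type Err;
    fn from_byte_str(bytes: &[u8]) -> Result<Self, Self::Err>;
}

/// Hypothetical extension trait standing in for the proposed `<[u8]>::parse`.
pub trait ParseBytes {
    fn parse_bytes<F: FromByteStr>(&self) -> Result<F, F::Err>;
}

impl ParseBytes for [u8] {
    fn parse_bytes<F: FromByteStr>(&self) -> Result<F, F::Err> {
        F::from_byte_str(self)
    }
}

/// Stand-in for the blanket `FromStr` impl: a `&str` is forwarded to the
/// byte-based parser without any extra work.
pub fn parse_str<T: FromByteStr>(s: &str) -> Result<T, T::Err> {
    s.as_bytes().parse_bytes()
}

/// Hypothetical example type: a MAC address written as `00:1b:44:11:3a:b7`.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct MacAddr(pub [u8; 6]);

#[derive(Debug, PartialEq, Eq)]
pub struct ParseMacError;

/// Decode one ASCII hex digit. Any other byte, including bytes that are
/// not valid UTF-8, is simply rejected; no separate validation pass is needed.
fn hex_val(b: u8) -> Result<u8, ParseMacError> {
    match b {
        b'0'..=b'9' => Ok(b - b'0'),
        b'a'..=b'f' => Ok(b - b'a' + 10),
        b'A'..=b'F' => Ok(b - b'A' + 10),
        _ => Err(ParseMacError),
    }
}

impl FromByteStr for MacAddr {
    type Err = ParseMacError;

    fn from_byte_str(bytes: &[u8]) -> Result<Self, Self::Err> {
        // Six two-digit groups plus five ':' separators.
        if bytes.len() != 17 {
            return Err(ParseMacError);
        }
        let mut out = [0u8; 6];
        for (i, chunk) in bytes.chunks(3).enumerate() {
            // Every group except the last must be followed by ':'.
            if i < 5 && chunk[2] != b':' {
                return Err(ParseMacError);
            }
            out[i] = (hex_val(chunk[0])? << 4) | hex_val(chunk[1])?;
        }
        Ok(MacAddr(out))
    }
}
```

With these definitions, both a byte slice (via parse_bytes) and a string (via parse_str) reach the same parser, which is the duplication-free property the blanket impl is meant to provide.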
If this is done, almost all types in std that implement FromStr will be able to switch to a FromByteStr implementation:
- integer and floating point types
- NonZeroX
- IpAddr and SocketAddr (Tracking Issue for addr_parse_ascii feature rust#101035)
Alternatives
- Just use TryFrom: this was decided against in the case of IpAddr, see Support parsing IP addresses from a byte string rust#94890 (comment)
- Name this FromBytes::from_bytes: from_byte_str is proposed instead to make it clear that this parses text-encoded data, as opposed to binary serialization (#287 (comment) and #287 (comment))
- Name this FromAscii and place it in std::ascii: if this name were selected, users may expect this to parse something like [ascii::Char; N] rather than &[u8]. I don't think we want this since &[u8] -> &[ascii::Char] requires a validation step, and most implementations should be able to just raise an error if input is invalid.
Open Questions
- Should the return type and possibly Err be able to reference the source bytes? (#287 (comment) and its followup)
- Should we name the associated type Error rather than Err? This isn't consistent with FromStr but is more consistent with TryFrom and TryInto, as well as the rest of the ecosystem (where Err is usually only Result::Err, and Error is an error type)
Links and related work
- IpAddr::from_bytes: Tracking Issue for addr_parse_ascii feature rust#101035 and its pre-implementation discussion Support parsing IP addresses from a byte string rust#94890
- ASCII character type discussion: Add an "ascii character" type to reduce unsafe needs in conversions #179
What happens now?
This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.
Possible responses
The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):
- We think this problem seems worth solving, and the standard library might be the right place to solve it.
- We think that this probably doesn't belong in the standard library.
Second, if there's a concrete solution:
- We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
- We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.