Interoperability - Rust API Guidelines

Interoperability

Types eagerly implement common traits (C-COMMON-TRAITS)

Rust's trait system does not allow orphans: roughly, every impl must live either in the crate that defines the trait or the implementing type. Consequently, crates that define new types should eagerly implement all applicable, common traits.

To see why, consider the following situation:

Crate std defines trait Display.
Crate url defines type Url, without implementing Display.
Crate webapp imports from both std and url.

There is no way for webapp to add Display to Url, since it defines neither. (Note: the newtype pattern can provide an efficient, but inconvenient workaround.)
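
The newtype workaround mentioned above can be sketched roughly as follows. Because `url::Url` is not available in a self-contained example, this sketch assumes `std::time::Duration` (which has no `Display` impl) as a stand-in for the foreign type; `DisplayDuration` is a hypothetical name chosen for illustration.

```rust
use std::fmt;
use std::time::Duration;

// `Duration` stands in for a foreign type that lacks `Display`.
// `DisplayDuration` is a hypothetical local wrapper (the "newtype").
pub struct DisplayDuration(pub Duration);

// Allowed by the orphan rule: the implementing type is defined in this crate.
impl fmt::Display for DisplayDuration {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}.{:03}s", self.0.as_secs(), self.0.subsec_millis())
    }
}

fn main() {
    let d = DisplayDuration(Duration::from_millis(1500));
    println!("{}", d);
}
```

The wrapper adds no runtime cost, but every caller has to wrap and unwrap values by hand, which is the inconvenience the note refers to.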

The most important common traits to implement from std are:

Copy
Clone
Eq
PartialEq
Ord
PartialOrd
Hash
Debug
Display
Default

Note that it is common and expected for types to implement both Default and an empty new constructor. new is the constructor convention in Rust, and users expect it to exist, so if it is reasonable for the basic constructor to take no arguments, then it should, even if it is functionally identical to default.
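
As a minimal sketch of this guideline, a type can derive the applicable common traits and offer both `Default` and an argument-free `new`. The `Buffer` type and its `len` method are hypothetical and exist only for illustration.

```rust
// `Buffer` is a hypothetical type used only for illustration.
// Deriving eagerly means downstream crates never need a workaround.
#[derive(Debug, Clone, Default, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Buffer {
    bytes: Vec<u8>,
}

impl Buffer {
    /// Creates an empty buffer; functionally identical to `Buffer::default()`.
    pub fn new() -> Self {
        Self::default()
    }

    /// Number of bytes currently stored.
    pub fn len(&self) -> usize {
        self.bytes.len()
    }
}

fn main() {
    assert_eq!(Buffer::new(), Buffer::default());
    println!("{:?}", Buffer::new());
}
```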

Conversions use the standard traits From, AsRef, AsMut (C-CONV-TRAITS)

The following conversion traits should be implemented where it makes sense:

From
TryFrom
AsRef
AsMut

The following conversion traits should never be implemented:

Into
TryInto

These traits have a blanket impl based on From and TryFrom. Implement those instead.

Examples from the standard library

From<u16> is implemented for u32 because a smaller integer can always be converted to a bigger integer.
From<u32> is not implemented for u16 because the conversion may not be possible if the integer is too big.
TryFrom<u32> is implemented for u16 and returns an error if the integer is too big to fit in u16.
From<Ipv6Addr> is implemented for IpAddr, which is a type that can represent both v4 and v6 IP addresses.
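
The same split between infallible and fallible conversions can be sketched for a user-defined type. `Percent` and `PercentOutOfRange` are hypothetical names; the point is that only `From` and `TryFrom` are implemented, while `into()` and `try_into()` come from the blanket impls.

```rust
use std::convert::{TryFrom, TryInto};

/// Hypothetical wrapper that only ever holds a value in 0..=100.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Percent(u8);

/// Hypothetical error returned when a value is out of range.
#[derive(Debug, PartialEq, Eq)]
pub struct PercentOutOfRange(pub u8);

// Infallible direction: every `Percent` is a valid `u8`.
impl From<Percent> for u8 {
    fn from(p: Percent) -> u8 {
        p.0
    }
}

// Fallible direction: not every `u8` is a valid `Percent`.
impl TryFrom<u8> for Percent {
    type Error = PercentOutOfRange;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        if value <= 100 {
            Ok(Percent(value))
        } else {
            Err(PercentOutOfRange(value))
        }
    }
}

fn main() {
    let p = Percent::try_from(42u8).unwrap();
    // `Into` and `TryInto` are provided by the blanket impls.
    let raw: u8 = p.into();
    let too_big: Result<Percent, _> = 200u8.try_into();
    println!("{} {:?}", raw, too_big);
}
```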

Collections implement FromIterator and Extend (C-COLLECT)

FromIterator and Extend enable collections to be used conveniently with the following iterator methods:

Iterator::collect
Iterator::partition
Iterator::unzip

FromIterator is for creating a new collection containing items from an iterator, and Extend is for adding items from an iterator onto an existing collection.

Examples from the standard library

Vec<T> implements both FromIterator<T> and Extend<T>.
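
For a custom collection, the two impls might look like the sketch below. `SortedVec` is a hypothetical collection that keeps its items sorted; re-sorting after every `extend` is a simplification for the example, not a recommendation.

```rust
use std::iter::FromIterator;

/// Hypothetical collection that keeps its items in sorted order.
#[derive(Debug, PartialEq)]
pub struct SortedVec<T> {
    items: Vec<T>,
}

impl<T> SortedVec<T> {
    pub fn as_slice(&self) -> &[T] {
        &self.items
    }
}

// `Extend` adds items from an iterator onto an existing collection.
impl<T: Ord> Extend<T> for SortedVec<T> {
    fn extend<I: IntoIterator<Item = T>>(&mut self, iter: I) {
        self.items.extend(iter);
        self.items.sort();
    }
}

// `FromIterator` creates a new collection and is what `collect` calls.
impl<T: Ord> FromIterator<T> for SortedVec<T> {
    fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> Self {
        let mut sorted = SortedVec { items: Vec::new() };
        sorted.extend(iter);
        sorted
    }
}

fn main() {
    let mut v: SortedVec<i32> = vec![3, 1, 2].into_iter().collect();
    v.extend(vec![0, 5, 4]);
    println!("{:?}", v.as_slice());
}
```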

Data structures implement Serde's Serialize, Deserialize (C-SERDE)

Types that play the role of a data structure should implement Serialize and Deserialize.

There is a continuum of types between things that are clearly a data structure and things that are clearly not, with gray area in between. LinkedHashMap and IpAddr are data structures. It would be completely reasonable for somebody to want to read in a LinkedHashMap or IpAddr from a JSON file, or send one over IPC to another process. LittleEndian is not a data structure. It is a marker used by the byteorder crate to optimize at compile time for bytes in a particular order, and in fact an instance of LittleEndian can never exist at runtime. So these are clear-cut examples; the #rust or #serde IRC channels can help assess more ambiguous cases if necessary.

If a crate does not already depend on Serde for other reasons, it may wish to gate Serde impls behind a Cargo cfg. This way downstream libraries only need to pay the cost of compiling Serde if they need those impls to exist.

For consistency with other Serde-based libraries, the name of the Cargo cfg should be simply "serde". Do not use a different name for the cfg like "serde_impls" or "serde_serialization".

The canonical implementation looks like this when not using derive:

[dependencies]
serde = { version = "1.0", optional = true }


pub struct T { /* ... */ }

#[cfg(feature = "serde")]
impl Serialize for T { /* ... */ }

#[cfg(feature = "serde")]
impl<'de> Deserialize<'de> for T { /* ... */ }

And when using derive:

[dependencies]
serde = { version = "1.0", optional = true, features = ["derive"] }


#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
pub struct T { /* ... */ }

Types are Send and Sync where possible (C-SEND-SYNC)

Send and Sync are automatically implemented when the compiler determines it is appropriate.

In types that manipulate raw pointers, be vigilant that the Send and Sync status of your type accurately reflects its thread safety characteristics. Tests like the following can help catch unintentional regressions in whether the type implements Send or Sync.


#[test]
fn test_send() {
    fn assert_send<T: Send>() {}
    assert_send::<MyStrangeType>();
}

#[test]
fn test_sync() {
    fn assert_sync<T: Sync>() {}
    assert_sync::<MyStrangeType>();
}

Error types are meaningful and well-behaved (C-GOOD-ERR)

An error type is any type E used in a Result<T, E> returned by any public function of your crate. Error types should always implement the std::error::Error trait, which is the mechanism by which error handling libraries like error-chain abstract over different types of errors, and which allows the error to be used as the source() of another error.

Additionally, error types should implement the Send and Sync traits. An error that is not Send cannot be returned by a thread run with thread::spawn. An error that is not Sync cannot be passed across threads using an Arc. These are common requirements for basic error handling in a multithreaded application.

Send and Sync are also important for being able to package a custom error into an IO error using std::io::Error::new, which requires a trait bound of Error + Send + Sync.

One place to be vigilant about this guideline is in functions that return Error trait objects, for example reqwest::Error::get_ref. Typically Error + Send + Sync + 'static will be the most useful for callers. The addition of 'static allows the trait object to be used with Error::downcast_ref.
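
The points about Send, Sync and 'static can be illustrated with a small sketch. `ConfigError` and `load` are hypothetical names; the standard library calls used are `thread::spawn`, `io::Error::new`, and `downcast_ref` on the boxed trait object.

```rust
use std::error::Error;
use std::fmt;
use std::io;
use std::thread;

/// Hypothetical error type; a unit struct is automatically `Send + Sync`.
#[derive(Debug)]
pub struct ConfigError;

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("configuration was invalid")
    }
}

impl Error for ConfigError {}

/// Hypothetical function returning the fully bounded trait object.
fn load() -> Result<(), Box<dyn Error + Send + Sync + 'static>> {
    Err(Box::new(ConfigError))
}

fn main() {
    // `Send` lets the error be returned from a spawned thread.
    let err = thread::spawn(load).join().unwrap().unwrap_err();

    // `'static` is what makes downcasting to the concrete type possible.
    assert!(err.downcast_ref::<ConfigError>().is_some());

    // `Send + Sync` lets the error be packaged into an `io::Error`.
    let io_err = io::Error::new(io::ErrorKind::Other, err);
    println!("{}", io_err);
}
```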

Never use () as an error type, even where there is no useful additional information for the error to carry.

() does not implement Error so it cannot be used with error handling libraries like error-chain.
() does not implement Display so a user would need to write an error message of their own if they want to fail because of the error.
() has an unhelpful Debug representation for users that decide to unwrap() the error.
It would not be semantically meaningful for a downstream library to implement From<()> for their error type, so () as an error type cannot be used with the ? operator.

Instead, define a meaningful error type specific to your crate or to the individual function. Provide appropriate Error and Display impls. If there is no useful information for the error to carry, it can be implemented as a unit struct.


use std::error::Error;
use std::fmt::Display;

// Instead of this...
fn do_the_thing() -> Result<Wow, ()> { /* ... */ }

// Prefer this...
fn do_the_thing() -> Result<Wow, DoError> { /* ... */ }

#[derive(Debug)]
struct DoError;

impl Display for DoError { /* ... */ }
impl Error for DoError { /* ... */ }

The error message given by the Display representation of an error type should be lowercase without trailing punctuation, and typically concise.
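
Put together, a unit-struct error following these conventions might look like the sketch below. The message text, the `succeed` parameter, and the `run` helper are assumptions made for illustration; `Wow` is given a trivial definition so the example compiles on its own.

```rust
use std::error::Error;
use std::fmt;

/// Placeholder success type, defined here only so the sketch compiles.
#[derive(Debug)]
pub struct Wow;

/// Unit-struct error: there is no extra information to carry.
#[derive(Debug)]
pub struct DoError;

impl fmt::Display for DoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Lowercase, concise, no trailing punctuation.
        f.write_str("the thing could not be done")
    }
}

// The default methods are sufficient; `description()` is left unimplemented.
impl Error for DoError {}

/// The `succeed` flag is a hypothetical stand-in for real work.
pub fn do_the_thing(succeed: bool) -> Result<Wow, DoError> {
    if succeed {
        Ok(Wow)
    } else {
        Err(DoError)
    }
}

// `?` works because `Box<dyn Error>` implements `From<E>` for any `E: Error`.
fn run() -> Result<(), Box<dyn Error>> {
    do_the_thing(false)?;
    Ok(())
}

fn main() {
    if let Err(e) = run() {
        eprintln!("error: {}", e);
    }
}
```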

Error::description() should not be implemented. It has been deprecated and users should always use Display instead of description() to print the error.

Examples from the standard library

ParseBoolError is returned when failing to parse a bool from a string.

Examples of error messages

"unexpected end of file"
"provided string was not `true` or `false`"
"invalid IP address syntax"
"second time provided was later than self"
"invalid UTF-8 sequence of {} bytes from index {}"
"environment variable was not valid unicode: {:?}"

Binary number types provide Hex, Octal, Binary formatting (C-NUM-FMT)

The std::fmt traits UpperHex, LowerHex, Octal, and Binary control the representation of a type under the {:X}, {:x}, {:o}, and {:b} format specifiers.

Implement these traits for any number type on which you would consider doing bitwise manipulations like | or &. This is especially appropriate for bitflag types. Numeric quantity types like struct Nanoseconds(u64) probably do not need these.
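
For a bitflag-style newtype, one simple approach is to delegate each trait to the wrapped integer, which also preserves flags such as `#` and zero padding. `Flags` is a hypothetical type used only for this sketch.

```rust
use std::fmt;

/// Hypothetical bitflag-style wrapper around a `u8`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Flags(pub u8);

// Each impl forwards to the inner integer so width, fill, and `#` still work.
impl fmt::LowerHex for Flags {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::LowerHex::fmt(&self.0, f)
    }
}

impl fmt::UpperHex for Flags {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::UpperHex::fmt(&self.0, f)
    }
}

impl fmt::Octal for Flags {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Octal::fmt(&self.0, f)
    }
}

impl fmt::Binary for Flags {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Binary::fmt(&self.0, f)
    }
}

fn main() {
    let flags = Flags(0b1010_1100);
    println!("{:x} {:X} {:o} {:b}", flags, flags, flags, flags);
}
```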

Generic reader/writer functions take R: Read and W: Write by value (C-RW-VALUE)

The standard library contains these two impls:


impl<'a, R: Read + ?Sized> Read for &'a mut R { /* ... */ }

impl<'a, W: Write + ?Sized> Write for &'a mut W { /* ... */ }

That means any function that accepts R: Read or W: Write generic parameters by value can be called with a mut reference if necessary.

In the documentation of such functions, briefly remind users that a mut reference can be passed. New Rust users often struggle with this. They may have opened a file and want to read multiple pieces of data out of it, but the function to read one piece consumes the reader by value, so they are stuck. The solution would be to leverage one of the above impls and pass &mut f instead of f as the reader parameter.
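
The situation described above can be sketched as follows. `read_u16` is a hypothetical helper that takes its reader by value, and an in-memory `io::Cursor` is assumed here in place of an opened file so the example is self-contained.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads one big-endian `u16`, taking the reader by value.
/// A `&mut` reference to any reader can be passed as well.
pub fn read_u16<R: Read>(mut reader: R) -> io::Result<u16> {
    let mut buf = [0u8; 2];
    reader.read_exact(&mut buf)?;
    Ok(u16::from_be_bytes(buf))
}

fn main() {
    // The cursor stands in for a file opened by the caller.
    let mut f = io::Cursor::new(vec![0x00u8, 0x01, 0x00, 0x02]);

    // Passing `&mut f` instead of `f` keeps the reader usable afterwards.
    let first = read_u16(&mut f).unwrap();
    let second = read_u16(&mut f).unwrap();
    println!("{} {}", first, second);
}
```

Passing `f` itself to the first call would move the reader into `read_u16`, and the second call would then fail to compile.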