Add `relnotes-api-list` in-tree tool by pietroalbini · Pull Request #143053 · rust-lang/rust

⚠️ This PR is not ready yet. Opening it up for early review. ⚠️

Add step to verify all the generated URLs are correct, possibly using linkchecker.
Figure out why some items are missing from the JSON output.
Add information about const stabilizations.
Make a decision on the outstanding questions.

This PR adds a new in-tree tool called relnotes-api-list to generate a simplified JSON representation of the standard library API. This representation will be uploaded as a dist artifact (but it won't be included in the published releases), and it will be used by the relnotes tool to generate the "Stabilized APIs" section by comparing the JSON file of multiple releases.

Behind the scenes, the tool consumes the Rustdoc JSON (cc @rust-lang/rustdoc) and depends on the in-tree src/rustdoc-json-types crate. Because this is an in-tree tool, PRs that modify the Rustdoc JSON output will also have to update this tool for CI to pass.

The generated JSON contains a tree structure, with each node containing an item (module, struct, function, etc) and its children (for example, the methods of a struct). This tree representation will allow the relnotes tool (for example) to only show a line for a new module being added instead of also showing all of the structs, functions and methods within that new module.
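To illustrate why the tree shape helps, here is a minimal sketch of the comparison the relnotes tool could run over two releases. The `Node` struct, its field names and the `added_items` function are hypothetical and only mirror the description above; they are not the schema or the code from this PR.

```rust
/// Hypothetical simplified node; the real JSON schema may differ.
#[derive(Debug, Clone)]
pub struct Node {
    /// Path to the item, used as its name (for example "std::option::Option").
    pub name: String,
    /// Rustdoc URL of the item, when one is known.
    pub url: Option<String>,
    /// Nested items, for example the methods of a struct.
    pub children: Vec<Node>,
}

impl Node {
    /// Convenience constructor for a node without a URL.
    pub fn new(name: &str, children: Vec<Node>) -> Node {
        Node { name: name.to_string(), url: None, children }
    }
}

/// Returns the names of the items present in `new` but not in `old`.
/// When an item is entirely new, only that item is reported: its
/// children are not visited.
pub fn added_items(old: &Node, new: &Node) -> Vec<String> {
    let mut out = Vec::new();
    collect(old, new, &mut out);
    out
}

fn collect(old: &Node, new: &Node, out: &mut Vec<String>) {
    for new_child in &new.children {
        match old.children.iter().find(|c| c.name == new_child.name) {
            // The item already existed: look for additions further down.
            Some(old_child) => collect(old_child, new_child, out),
            // The whole subtree is new: one entry is enough.
            None => out.push(new_child.name.clone()),
        }
    }
}
```

With this shape, a new module that contains many structs and functions produces a single entry, while a new method on an existing struct is still reported on its own.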

Outstanding question: `impl` blocks

While deciding how to represent most items in the JSON is trivial (just the path to the item as the name, and the Rustdoc URL as the URL), impl blocks are trickier, and there is no single obvious solution to them.

The first problem is whether to include impl blocks at all:

An impl $type {} is just a container of methods and associated items, and should intuitively be "transparent". Whether an item is in one impl block or another doesn't influence the public API (as long as one of them is not cfg'd out). Because of this, the code just walks through these kinds of impl blocks without recording them in the resulting JSON (the children are still emitted as if they are children of the type).
An impl $trait for $type {} is the opposite: the presence or absence of the impl is load-bearing, but the children of the impl are not relevant (we don't want a relnotes item for each type implementing a trait when a new trait method is added).
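The two rules above can be sketched as follows. `RawItem` is a hypothetical, heavily simplified stand-in for the Rustdoc JSON items; the actual tool works on the rustdoc-json-types data structures instead.

```rust
/// Hypothetical, heavily simplified stand-in for Rustdoc JSON items.
pub enum RawItem {
    /// A method or associated item, identified by its name.
    Method(String),
    /// `impl $type {}`: only a container for its items.
    InherentImpl(Vec<RawItem>),
    /// `impl $trait for $type {}`: the impl itself matters, its items do not.
    TraitImpl { name: String, items: Vec<RawItem> },
}

/// Flattens the items attached to a type into the names that would be
/// recorded as children of that type.
pub fn children_of_type(items: &[RawItem]) -> Vec<String> {
    let mut out = Vec::new();
    for item in items {
        match item {
            RawItem::Method(name) => out.push(name.clone()),
            // Inherent impls are transparent: emit their contents as if
            // they were direct children of the type.
            RawItem::InherentImpl(inner) => out.extend(children_of_type(inner)),
            // Trait impls are recorded, but their contents are skipped.
            RawItem::TraitImpl { name, .. } => out.push(name.clone()),
        }
    }
    out
}
```

Splitting methods across several inherent impl blocks does not change the result, while each trait impl contributes exactly one entry regardless of how many methods it contains.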

So, if we go with these assumptions, we only care about how to represent impl $trait for $type {} in the resulting JSON. This has two big open questions though:

What name do we put for the impl block? The intuitive answer there (and what I currently implemented in this PR) is to pretty-print the data contained in the Rustdoc JSON, which results in names like this:
impl<A, E, V: FromIterator<A>> FromIterator<Result<A, E>> for Result<V, E>
impl<T: $crate::hash::Hash> Hash for Option<T>
The downsides of this approach are:
1. A lot of implementation complexity, and extra work when a new Rustdoc JSON version wants to land, since we need to exhaustively match a large surface of the Rustdoc JSON API. This is basically the whole purpose of the pretty_print.rs module.
2. Little control over what is rendered as the name, since the information comes straight from Rustdoc JSON. In the examples above we can see relative paths and $crate. Resolving them in the tool would be overly complex IMO.
The upside of the approach though is that each item is unambiguous, even when complex where clauses are used.
What URL do we put for the impl block? Right now I don't put any URL, as I'm not sure what approach to take. The problem is that the URLs for the impls also require walking through a bunch of the Rustdoc JSON information, and I'd need to spend even more time figuring out how to emit them correctly:

std/result/enum.Result.html#impl-FromIterator<Result<A,+E>>-for-Result<V,+E>  
std/option/enum.Option.html#impl-Hash-for-Option<T>
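For illustration only, the fragments in the two URLs above can be reproduced from the pretty-printed impl names with plain string manipulation. The rule below (drop the `impl<...>` generics, join with `-for-`, turn spaces into `+`) is an assumption inferred from these two examples, not Rustdoc's actual ID generation, and the `impl_fragment` function is hypothetical. It will not handle where clauses, `->` inside bounds, or ID deduplication.

```rust
/// Builds a URL fragment from a pretty-printed trait impl name.
/// Returns `None` if the name does not look like `impl<...> Trait for Type`.
/// The rule is inferred from two examples and is not Rustdoc's real logic.
pub fn impl_fragment(name: &str) -> Option<String> {
    let rest = name.strip_prefix("impl")?;
    // Skip the generic parameter list, if any, by matching angle brackets.
    let rest = if rest.starts_with('<') {
        let mut depth = 0usize;
        let mut end = None;
        for (idx, ch) in rest.char_indices() {
            match ch {
                '<' => depth += 1,
                '>' => {
                    depth -= 1;
                    if depth == 0 {
                        end = Some(idx + 1);
                        break;
                    }
                }
                _ => {}
            }
        }
        &rest[end?..]
    } else {
        rest
    };
    let rest = rest.trim_start();
    if !rest.contains(" for ") {
        return None;
    }
    // Join trait and type with "-for-", then encode remaining spaces as "+".
    let joined = rest.replacen(" for ", "-for-", 1).replace(' ', "+");
    Some(format!("impl-{joined}"))
}
```

That such a shortcut only covers the simple cases is part of why emitting correct URLs requires walking the Rustdoc JSON data instead.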

Outstanding question: move some of the behavior to Rustdoc JSON?

There is a lot of complexity right now in the tool that could maybe benefit from being uplifted in the Rustdoc JSON emitter:

Stability information: right now I need to walk the tree separately to determine the stability information when whole modules are marked as stable or unstable, and I'm not even sure if I implemented the logic correctly. Rustdoc already knows the stability of each item, and it would be nice to have it included in the Rustdoc JSON if the crate uses staged_api.
Item URLs: I'm kinda torn on whether I'd want URLs to be included in the Rustdoc JSON. On one hand, they will add a lot of bloat that some JSON users won't need. On the other hand, determining the URLs of items is fairly complex, especially for impls, and Rustdoc providing the correct URLs would be immensely helpful.
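As a rough picture of the separate stability walk, here is a minimal sketch of one possible propagation rule: an item inside an unstable parent is treated as unstable, and otherwise an item uses its own annotation or falls back to its parent's. Both the rule and the `AnnotatedItem` type are assumptions made for the example; they are not the logic implemented in this PR, nor how Rustdoc computes stability.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stability {
    Stable,
    Unstable,
}

/// Hypothetical item carrying an optional stability annotation of its own.
pub struct AnnotatedItem {
    pub name: String,
    pub own: Option<Stability>,
    pub children: Vec<AnnotatedItem>,
}

/// Walks the tree and records the effective stability of every item.
/// Assumed rule: an unstable parent makes everything below it unstable;
/// otherwise the item's own annotation wins, falling back to the parent's.
pub fn resolve(
    item: &AnnotatedItem,
    inherited: Option<Stability>,
    out: &mut Vec<(String, Option<Stability>)>,
) {
    let effective = match (inherited, item.own) {
        (Some(Stability::Unstable), _) => Some(Stability::Unstable),
        (_, Some(own)) => Some(own),
        (inherited, None) => inherited,
    };
    out.push((item.name.clone(), effective));
    for child in &item.children {
        resolve(child, effective, out);
    }
}
```

If Rustdoc JSON carried the stability of each item directly, a walk like this would not be needed in the tool at all.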

Q&A

Why is this an in-tree tool instead of consuming the Rustdoc JSON from the relnotes tool?
I initially tried to implement this directly in the relnotes tool, but quickly ran into the problem that most releases include a new version of the Rustdoc JSON. We'd then either have to implement our own data structures in the tool to support multiple Rustdoc JSON versions (and figure out what changed in each release), or keep a copy of the source code for each version. Being in-tree means the tool will be updated whenever a new Rustdoc JSON version lands.
Why not use cargo-public-api instead of reimplementing it in-tree?
cargo-public-api is very similar to what this PR implements, but it depends on the public rustdoc-types crate and only supports one version of Rustdoc JSON at a time. It suffers from the same problems this tool would have if it lived out of tree (see above), and we have no guarantee it will be updated to the new Rustdoc JSON version by the time we need to prepare the release notes.

r? @Mark-Simulacrum
cc @rust-lang/release
