Add relnotes-api-list in-tree tool by pietroalbini · Pull Request #143053 · rust-lang/rust
⚠️ This PR is not ready yet. Opening it up for early review. ⚠️
- Add step to verify all the generated URLs are correct, possibly using linkchecker.
- Figure out why some items are missing from the JSON output.
- Add information about const stabilizations.
- Make a decision on the outstanding questions.
This PR adds a new in-tree tool called relnotes-api-list to generate a simplified JSON representation of the standard library API. This representation will be uploaded as a dist artifact (but it won't be included in the published releases), and it will be used by the relnotes tool to generate the "Stabilized APIs" section by comparing the JSON files of multiple releases.
Behind the scenes, the tool consumes the Rustdoc JSON (cc @rust-lang/rustdoc) and depends on the in-tree src/rustdoc-json-types crate. Because this is an in-tree tool, PRs modifying the Rustdoc JSON output will also have to adapt this tool to pass CI.
The generated JSON contains a tree structure, with each node containing an item (module, struct, function, etc) and its children (for example, the methods of a struct). This tree representation will allow the relnotes tool (for example) to only show a line for a new module being added instead of also showing all of the structs, functions and methods within that new module.
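To illustrate why the tree shape helps, here is a minimal sketch of how a consumer such as the relnotes tool could compare two releases. The `ApiNode` type, its fields, and the `added_items` and `node` helpers are hypothetical names made up for this example; they are not the tool's actual schema or code.

```rust
use std::collections::BTreeMap;

/// One node of the simplified API tree: an item plus its children.
/// Hypothetical shape, not the schema emitted by relnotes-api-list.
#[derive(Debug, Clone, Default)]
struct ApiNode {
    /// Rustdoc URL of the item, if one could be determined.
    url: Option<String>,
    /// Children keyed by item path (for example, the methods of a struct).
    children: BTreeMap<String, ApiNode>,
}

/// Small helper to build a node without a URL from (name, child) pairs.
fn node(children: Vec<(&str, ApiNode)>) -> ApiNode {
    ApiNode {
        url: None,
        children: children
            .into_iter()
            .map(|(name, child)| (name.to_string(), child))
            .collect(),
    }
}

/// Collect the items present in `new` but missing from `old`.
/// When a whole subtree is new, only its root is reported.
fn added_items(old: &ApiNode, new: &ApiNode, out: &mut Vec<String>) {
    for (name, new_child) in &new.children {
        match old.children.get(name) {
            // The item already existed: look for additions below it.
            Some(old_child) => added_items(old_child, new_child, out),
            // The item is new: report it once and skip its children.
            None => out.push(name.clone()),
        }
    }
}
```

With this shape, a newly added module produces a single entry, while a new method on an existing struct is still reported individually.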
Outstanding question: impl blocks
While deciding how to represent most items in the JSON is trivial (just the path to the item as the name, and the Rustdoc URL as the URL), impl blocks are trickier, and there is no single obvious solution to them.
The first problem is whether to include impl blocks at all:
- An `impl $type {}` is just a container of methods and associated items, and should intuitively be "transparent". Whether an item is in one `impl` block or another doesn't influence the public API (as long as one of them is not `cfg`'d out). Because of this, the code just walks through these kinds of `impl` blocks without recording them in the resulting JSON (the children are still emitted as if they are children of the type).
- An `impl $trait for $type` is the opposite, as the presence or not of the `impl` is load bearing, but the children of the `impl` are not relevant (we don't want a relnotes item for each type implementing a trait when a new trait method is added).
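The two rules above can be sketched as follows. The `Item` enum is a deliberately simplified stand-in invented for this example; it is not the `rustdoc-json-types` API, and `children_of_type` is a hypothetical helper rather than code from this PR.

```rust
/// Simplified stand-in for the items attached to a type.
/// Hypothetical: the real Rustdoc JSON types are far richer.
enum Item {
    /// A method or associated item, identified by its path.
    Method(String),
    /// `impl $type { ... }`
    InherentImpl(Vec<Item>),
    /// `impl $trait for $type { ... }`, with its pretty-printed header.
    TraitImpl { header: String, items: Vec<Item> },
}

/// Names that would be recorded as children of the type.
fn children_of_type(items: &[Item]) -> Vec<String> {
    let mut out = Vec::new();
    for item in items {
        match item {
            Item::Method(name) => out.push(name.clone()),
            // Transparent: emit the children as if they belonged to the type.
            Item::InherentImpl(inner) => out.extend(children_of_type(inner)),
            // Load bearing: record the impl itself, but not its children.
            Item::TraitImpl { header, .. } => out.push(header.clone()),
        }
    }
    out
}
```

Under these assumptions, splitting methods across several inherent `impl` blocks does not change the output, and adding a trait method does not create one entry per implementing type.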
So, if we go with these assumptions, we only care about how to represent `impl $trait for $type {}` in the resulting JSON. This has two big open questions though:
- What name do we put for the `impl` block? The intuitive answer there (and what I currently implemented in this PR) is to pretty-print the data contained in the Rustdoc JSON, which results in names like this:
    impl<A, E, V: FromIterator<A>> FromIterator<Result<A, E>> for Result<V, E>
    impl<T: $crate::hash::Hash> Hash for Option<T>
  The downsides of this approach are:
  - A lot of implementation complexity, and extra work when a new Rustdoc JSON version wants to land, since we need to exhaustively match a large surface of the Rustdoc JSON API. This is basically the whole purpose of the `pretty_print.rs` module.
  - Little control over what is rendered as the name, since the information comes straight from Rustdoc JSON. In those examples, for example, we can see relative paths and `$crate`. Resolving them in the tool would be overly complex IMO.

  The upside of the approach though is that each item is unambiguous, even when complex `where` clauses are used.
- What URL do we put for the `impl` block? Right now I don't put any URL, as I'm not sure what approach to take. The problem is that the URLs for the `impl`s also require walking through a bunch of the Rustdoc JSON information, and I'd need to spend even more time figuring out how to emit them correctly:
    std/result/enum.Result.html#impl-FromIterator<Result<A,+E>>-for-Result<V,+E>
    std/option/enum.Option.html#impl-Hash-for-Option<T>

Outstanding question: move some of the behavior to Rustdoc JSON?
There is a lot of complexity right now in the tool that could maybe benefit from being uplifted in the Rustdoc JSON emitter:
- Stability information: right now I need to walk the tree separately to determine the stability information when whole modules are marked as stable or unstable, and I'm not even sure if I implemented the logic correctly. Rustdoc already knows the stability of each item, and it would be nice to have it included in the Rustdoc JSON if the crate uses staged_api.
- Item URLs: I'm kinda torn on whether I'd want URLs to be included in the Rustdoc JSON. On one hand, they will add a lot of bloat that some JSON users won't need. On the other hand, determining the URLs of items is fairly complex, especially for `impl`s, and Rustdoc providing the correct URLs would be immensely helpful.
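For context on the stability point, the separate tree walk could look roughly like the sketch below. It assumes a simplified rule chosen purely for illustration: an item without its own annotation takes the stability of its nearest annotated ancestor. The actual staged_api rules in rustc may differ, and `Stability`, `Node`, `item`, and `resolve` are hypothetical names, not code from this PR.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Stability {
    Stable,
    Unstable,
}

/// Hypothetical tree node carrying the annotation found on the item itself.
struct Node {
    name: String,
    /// `None` when the item has no stability annotation of its own.
    own: Option<Stability>,
    children: Vec<Node>,
}

/// Small helper to build a node.
fn item(name: &str, own: Option<Stability>, children: Vec<Node>) -> Node {
    Node { name: name.to_string(), own, children }
}

/// Walk the tree, pushing the inherited stability down to unannotated items.
/// ASSUMPTION: plain nearest-ancestor inheritance, which may not match rustc.
fn resolve(node: &Node, inherited: Stability, out: &mut Vec<(String, Stability)>) {
    let effective = node.own.unwrap_or(inherited);
    out.push((node.name.clone(), effective));
    for child in &node.children {
        resolve(child, effective, out);
    }
}
```

If Rustdoc JSON exposed the resolved stability of each item directly, a walk like this would not be needed in the tool at all.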
Q&A
- Why is this an in-tree tool instead of consuming the Rustdoc JSON from the relnotes tool?
  I initially tried to implement this directly in the relnotes tool, but quickly ran into the problem that most releases include a new version of the Rustdoc JSON. We'd then either have to implement our own data structures into the tool supporting multiple Rustdoc JSON versions (and having to figure out what changed in each release), or have multiple copies of the source code for each version. Being in-tree means it will be updated whenever a new Rustdoc JSON version happens.
- Why not use cargo-public-api instead of reimplementing it in-tree?
  cargo-public-api is very similar to what this PR implements, but it depends on the public `rustdoc-types` crate and only supports one version of Rustdoc JSON at a time. It suffers the same problems as this being an out-of-tree tool (see above), and we don't have a guarantee it will be updated to the new Rustdoc JSON version by the time we need to prepare the release notes.
r? @Mark-Simulacrum
cc @rust-lang/release