rustdoc_json: Intern filenames by nnethercote · Pull Request #142945 · rust-lang/rust
So that each unique filename only occurs once in the JSON output. This reduces the JSON file size by about 9%.
rustdoc-json-types is a public (although nightly-only) API. If possible, consider changing src/librustdoc/json/conversions.rs; otherwise, make sure you bump the FORMAT_VERSION constant.
cc @CraftSpider, @aDotInTheVoid, @Enselic, @obi1kenobi
This does make the output very slightly harder to work with, because filenames require a lookup instead of being directly available. But it's kind of a no-brainer in terms of file size, and for that reason is mentioned in the 2023 roadmap (#106697). It also slightly reduces peak memory usage.
There are currently a couple of tests failing. They need #142479.
I also have an implementation of String interning, though I haven't filed a PR yet. It's very similar to this PR.
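For readers unfamiliar with the technique, here is a minimal sketch of filename interning, not the code from this PR. It assumes a table of unique filenames plus spans that store an index into that table. The names `FilenameId`, `Span`, and `FilenameInterner` are hypothetical and do not come from rustdoc-json-types.

```rust
use std::collections::HashMap;

/// Hypothetical index into the filename table (not the real rustdoc-json-types type).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct FilenameId(pub u32);

/// Hypothetical span that stores an interned filename id instead of the path itself.
#[derive(Debug)]
pub struct Span {
    pub filename: FilenameId,
    pub begin: (usize, usize),
    pub end: (usize, usize),
}

/// Stores each unique filename once and hands out small integer ids.
#[derive(Default)]
pub struct FilenameInterner {
    ids: HashMap<String, FilenameId>,
    names: Vec<String>,
}

impl FilenameInterner {
    /// Returns the existing id for `name`, or assigns the next free one.
    pub fn intern(&mut self, name: &str) -> FilenameId {
        if let Some(&id) = self.ids.get(name) {
            return id;
        }
        let id = FilenameId(self.names.len() as u32);
        self.names.push(name.to_owned());
        self.ids.insert(name.to_owned(), id);
        id
    }

    /// The extra lookup a consumer now pays to recover the filename.
    pub fn resolve(&self, id: FilenameId) -> Option<&str> {
        self.names.get(id.0 as usize).map(String::as_str)
    }

    /// Number of unique filenames stored.
    pub fn len(&self) -> usize {
        self.names.len()
    }
}
```

With this shape, thousands of spans pointing into the same file each serialize a small integer rather than the full path, which is where the size saving comes from. The cost is the `resolve` step on the consumer side.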
aDotInTheVoid added the S-blocked label (Status: Blocked on something else such as an RFC or other implementation work.) and removed the S-waiting-on-review label (Status: Awaiting review from the assignee but also interested parties.)
Rustdoc JSON format changes LGTM. Thank you!
Comment on lines -396 to -398
/// Rustdoc makes no guarantees about the inner value of Id's. Applications
/// should treat them as opaque keys to lookup items, and avoid attempting
/// to parse them, or otherwise depend on any implementation details.
Is it worth keeping this portion of the comment?
On one hand, this comment used to be a lot more useful when the ID was of the form 2:1234 (more or less) where 2 was the crate ID and the rest could be (ab)used as a DefId inside that crate with better than 50-50 odds of working out. On the other hand, the ID values may change representation again in the future, and this is a solid future-proofing warning.
Thanks for explaining how it used to be different, now I understand how the comment came to be. It used to make sense, but no more. The id is just an index into another part of the JSON. That's not an "opaque key" in any sense. And the idea of parsing an integer is silly.
As for future-proofing: sure, this could change, but so could literally anything else in the representation. So it doesn't need explicit mention.
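To illustrate the point under discussion, here is a rough sketch of a consumer that uses an id only as a lookup key and never inspects its inner value. The `Id` and `Item` types below are simplified stand-ins written for this example, under the assumption that an id wraps an integer. They are not the actual rustdoc-json-types definitions.

```rust
use std::collections::HashMap;

/// Simplified stand-in for an item id (assumed here to wrap an integer).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Id(pub u32);

/// Simplified stand-in for a documented item.
#[derive(Debug, PartialEq)]
pub struct Item {
    pub name: String,
}

/// Looks up an item by id, using the id purely as a key into the index.
pub fn item_name(index: &HashMap<Id, Item>, id: Id) -> Option<&str> {
    index.get(&id).map(|item| item.name.as_str())
}
```

Code written this way keeps working regardless of how the id is represented, as long as the same id appears both at the reference site and as a key in the index.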
Labels
Area: Rustdoc JSON backend
Status: Blocked on something else such as an RFC or other implementation work.
Relevant to the rustdoc team, which will review and decide on the PR/issue.