Improve print_tts
by nnethercote · Pull Request #114571 · rust-lang/rust (original) (raw)
You haven't answered the following questions yet.
- Where does print_tt's output show up in practice? I see it in the output of stringify!, is there anywhere else?
- Could changes to the output break real-world code? The docs for stringify! explicitly state that the output might change, so I'm not too worried by that, but maybe there are other places it could affect.
It would be helpful if you did.
I don't think
FollowedBy
is a right model for this.
What is the right model? You've made two suggestions:
- Reinstate Move jointness censoring to proc_macro #76285, which was just a refactoring that didn't affect pretty-printing at all.
- Add
Spacing::Unspecified
, which might help pretty-printing of proc macro code slightly, but would have no effect on pretty-printing of parsed code.
This is a PR mostly about pretty-printing of parsed code, but your suggestions don't relate to that.
The "jointness" is an inherent property attached to a single token (tree) rather than to a sequence of token trees.
I don't understand how FollowedBy
is particularly different. Both are attached to single tokens. Both describe a relationship with the subsequent token tree.
Token streams can be arbitrarily modified by users and with the new scheme the real stream can go out of sync with
FollowedBy
values even more easily than with the previousSpacing
.
Arbitrarily modified how? Token streams can be modified in proc macros but FollowedBy
isn't used there, Spacing
is. How else can token streams be arbitrarily modified by users?
I think the matklad's scheme in #76285 was fine
It caused a regression.
when parsing a token stream we just need to track whether a token is immediately followed by something or not, and then "censor" that information at proc macro boundary to match the public facing behavior documented in #114617.
That's exactly what this PR does. The three-value FollowedBy
enum tracks what a token is followed by, and then it is "censored" at the proc macro boundary by converting it to the less expressive Spacing
. FollowedBy
is also used to improve pretty-printing.
I'd ideally first land a version of #76285 first, without any pretty-printing changes or renamings, but with regressions fixed. Not sure how exactly the regression was fixed in this PR, but I assume it doesn't strictly require introducing a third variant?
The problem in #76285 was the addition of code in from_internal
that changed Joint
markers to Alone
if they weren't followed by a TokenTree::Token
that satisfies is_op
. This included the final token in a stream.
My original code had a similar check, causing the same problem. I fixed it by moving the is_op
check from from_internal
back into parse_token_trees
, and introducing the three values in FollowedBy
. I even added a comment warning others from adding a check to from_internal
, because both matklad and I had made that mistake.
As for pretty-printing streams from proc macros, I think we can extend the internal spacing to
enum Spacing { Joint, Alone, Unspecified }
whereUnspecified
comes from non-Punct
tokens created by proc macros. In theUnspecified
case pretty-printing can fall back to heuristics.
What about pretty-printing streams from parsed code? The whole point of this PR is to improve that case. #76285 didn't affect that case. Your original suggestion on Zulip was "it would be preferable to preserve jointness for all tokens and use it instead during pretty-printing." That is what this PR does. You said that FollowedBy
isn't the right model, but you haven't suggested an alternative that would improve pretty-printing from parsed code. The Joint
/Alone
model used for proc macros won't suffice, because Alone
covers both "followed by whitespace" and "followed by non-punct", and better pretty-printing requires those cases be distinguished. Your suggested Joint
/Alone
/Unspecified
also won't help that case.
FollowedBy
works because it (a) has enough information to improve pretty-printing of parse code, and (b) downgrades perfectly to Spacing
for proc macros.