Improve print_tts by nnethercote · Pull Request #114571 · rust-lang/rust (original) (raw)

You haven't answered the following questions yet.

It would be helpful if you did.

I don't think FollowedBy is a right model for this.

What is the right model? You've made two suggestions:

This is a PR mostly about pretty-printing of parsed code, but your suggestions don't relate to that.

The "jointness" is an inherent property attached to a single token (tree) rather than to a sequence of token trees.

I don't understand how FollowedBy is particularly different. Both are attached to single tokens. Both describe a relationship with the subsequent token tree.

Token streams can be arbitrarily modified by users and with the new scheme the real stream can go out of sync with FollowedBy values even more easily than with the previous Spacing.

Arbitrarily modified how? Token streams can be modified in proc macros but FollowedBy isn't used there, Spacing is. How else can token streams be arbitrarily modified by users?

I think the matklad's scheme in #76285 was fine

It caused a regression.

when parsing a token stream we just need to track whether a token is immediately followed by something or not, and then "censor" that information at proc macro boundary to match the public facing behavior documented in #114617.

That's exactly what this PR does. The three-value FollowedBy enum tracks what a token is followed by, and then it is "censored" at the proc macro boundary by converting it to the less expressive Spacing. FollowedBy is also used to improve pretty-printing.

I'd ideally first land a version of #76285 first, without any pretty-printing changes or renamings, but with regressions fixed. Not sure how exactly the regression was fixed in this PR, but I assume it doesn't strictly require introducing a third variant?

The problem in #76285 was the addition of code in from_internal that changed Joint markers to Alone if they weren't followed by a TokenTree::Token that satisfies is_op. This included the final token in a stream.

My original code had a similar check, causing the same problem. I fixed it by moving the is_op check from from_internal back into parse_token_trees, and introducing the three values in FollowedBy. I even added a comment warning others from adding a check to from_internal, because both matklad and I had made that mistake.

As for pretty-printing streams from proc macros, I think we can extend the internal spacing to enum Spacing { Joint, Alone, Unspecified } where Unspecified comes from non-Punct tokens created by proc macros. In the Unspecified case pretty-printing can fall back to heuristics.

What about pretty-printing streams from parsed code? The whole point of this PR is to improve that case. #76285 didn't affect that case. Your original suggestion on Zulip was "it would be preferable to preserve jointness for all tokens and use it instead during pretty-printing." That is what this PR does. You said that FollowedBy isn't the right model, but you haven't suggested an alternative that would improve pretty-printing from parsed code. The Joint/Alone model used for proc macros won't suffice, because Alone covers both "followed by whitespace" and "followed by non-punct", and better pretty-printing requires those cases be distinguished. Your suggested Joint/Alone/Unspecified also won't help that case.

FollowedBy works because it (a) has enough information to improve pretty-printing of parse code, and (b) downgrades perfectly to Spacing for proc macros.