Suppression Comments · astral-sh/ruff · Discussion #6338 (original) (raw)
This discussion proposes a design for suppression-comments in the formatter and documents the decision for the future.
Black
Black supports fmt: off
followed by an optional fmt: on
own-line comments. Anything between does not get formatted:
Black reformats entire files in place. It doesn’t reformat lines that end with # fmt: skip or blocks that start with # fmt: off and end with # fmt: on. # fmt: on/off must be on the same level of indentation and in the same block, meaning no unindents beyond the initial indentation level between them. It also recognizes YAPF’s block comments to the same effect as a courtesy for straddling code. [source]
if outer: if test: # fmt:off does = not_get + formatted # fmt:on print("gets formatted")
The formatting gets automatically re-enabled if there's a dedent decreasing the indention level below the initial indentation of the fmt: off
comment. The following is semantically identical to the above example:
if outer: if test: # fmt:off does = not_get + formatted print("gets formatted")
The indentation behavior doesn't apply to parenthesized expressions (any expression enclosed by a ()
, []
, or {}
pair). Instead, Black re-enables the formatting when reaching the opening parentheses, meaning the following two examples are semantically equivalent:
```python a = ( # fmt: off does + not_get + formatted + # fmt: on ) + but + this + does
a = ( # fmt: off does + not_get + formatted + ) + but + this + does
Black further supports trailing fmt: skip
comments that disable formatting for the preceding line. I haven't been able to figure out fully what the logic is when using inside of expressions. Here a few examples:
Doesn't format anything except that it adds two spaces before the comment
if ( aaaa < True + c): # fmt:skip bbb + c
Formats everything, treads fmt:skip as a regular trailing comment?
if ( aaaa < True # fmt:skip + c): bbb + c
Formats content except the line annotated with fmt:skip
if ( aaaa < (aaa + b) # fmt:skip + c ): bbb + c
Proposal
Implementing Black's very flexible fmt: off
..fmt: on
suppression comments is challenging with our formatter architecture, and I also don't think it is necessary. I analyzed the usages of our ecosystem projects, and the vast majority only use statement-level suppression comments. These are easy to reason about and almost always sufficient.
Suppression comments are also a sign that the formatter isn't good enough. I want to focus on improving formatting rather than investing in a very flexible and powerful suppression system. Suppression comments don't just mean that we're back to discussing how the suppressed code should be formatted, but the suppression also decreases readability overall.
What's important from my perspective is that:
- You can intuitively tell which code gets formatted and how the suppressed code interacts with the enclosing formatted code (how the formatted file ultimately looks like)
- They support use cases like generated code
- You can suppress any other code with reasonable effort (it may require assigning to temporary variables)
Own line fmt: off
and fmt: on
comments
fmt: off
and fmt: on
comments allow to disable formatting for a range of statements. The statements must have the same or higher indentation level than the statement annotated with the fmt: off
comment. The fmt: on
is optional, but its best practice to use it.
I propose to limit fmt: off
... and fmt: on
to statement level only. The following examples are all valid
a = b
fmt: off
not_formatted = 3 more
fmt:on
if True:
fmt: off
suite if and_nested: not_formatted
print("formatted")
However, the following examples are not valid (but are valid in black)
if True: pass
fmt: off
elif False: pass
@Test
fmt: off
@Test2 def Abcd(): pass
- The
if..elif
case doesn't fulfill the goal that suppression comments should be intuitive. Does it suppress theelif
body only or all statements with the same indent as theelif
case? - The decorator case has the same problem. Does it disable the formatting only for the decorators, the decorators and the function signature, the decorators and the function, or the decorators, function, and any statements following after the function with the same indent?
That's why I believe fmt: skip
is useful
Trailing fmt: skip
or fmt: off
comments
A trailing fmt: skip
or fmt: off
comment suppresses formatting for the preceding statement, case header, decorator, function definition, or class definition:
if True: pass elif False: # fmt: skip pass
@Test @Test2 # fmt: off def test(): ...
a = [1, 2, 3, 4, 5] # fmt: off
def test(a, b, c, d, e, f) -> int: # fmt: skip pass
There are two caveats where the use of fmt: skip
is not intuitive:
a; b; c # fmt: skip
def test(): pass # fmt: skip
a; b; c # fmt skip
suppresses the formatting for the statementc
, and not the whole physical line as one might expect. This is because the AST has no sequence statement.def test(): pass # fmt: skip
suppresses formatting for the whole function and not just thepass
statement.
I think this is a shortcoming that we have to accept.
Unsupported Use cases
I went through all fmt: off
and fmt: skip
usages to identify the uses that either require changing the suppression comments or are entirely unsupported.
Between decorators
huggingface:transformers/src/transformers/models/gpt2/modeling_gpt2.py 1516- 1517- # Initialize weights and apply final processing 1518- self.post_init() 1519- 1520- @add_start_docstrings_to_model_forward(GPT2_INPUTS_DOCSTRING) 1521: # fmt: off 1522- @add_code_sample_docstrings( 1523- checkpoint="brad1141/gpt2-finetuned-comp2", 1524- output_type=TokenClassifierOutput, 1525- config_class=_CONFIG_FOR_DOC, 1526- expected_loss=0.25,
-> Use a trailing fmt: skip
Inside Lists
752- vocab += [(piece.piece, piece.score) for piece in proto.pieces[3:]] 753- vocab += [ 754: # fmt: off 755- ('ace_Arab', 0.0), ('ace_Latn', 0.0), ('acm_Arab', 0.0), ('acq_Arab', 0.0), ('aeb_Arab', 0.0), ('afr_Latn', 0.0), ('ajp_Arab', 0.0), ('aka_Latn', 0.0), ('amh_Ethi', 0.0), ('apc_Arab', 0.0), ('arb_Arab', 0.0), ('ars_Arab', 0.0), ('ary_Arab', 0.0), ('arz_Arab', 0.0), ('asm_Beng', 0.0), ('ast_Latn', 0.0), ('awa_Deva', 0.0), ('ayr_Latn', 0.0), ('azb_Arab', 0.0), ('azj_Latn', 0.0), ('bak_Cyrl', 0.0), ('bam_Latn', 0.0), ('ban_Latn', 0.0), ('bel_Cyrl', 0.0), ('bem_Latn', 0.0), ... 756- # fmt: on 757- ]
jim22k:python-graphblas/graphblas/core/operator/unary.py 334- UINT64, 335- ] 336- if _supports_complex: 337- position_dtypes.extend([FC32, FC64]) 338- for names, *types in [ 339: # fmt: off 340- ( 341- ( 342- "erf", "erfc", "lgamma", "tgamma", "acos", "acosh", "asin", "asinh", 343- "atan", "atanh", "ceil", "cos", "cosh", "exp", "exp2", "expm1", "floor", 344- "log", "log10", "log1p", "log2", "round", "signum", "sin", "sinh", "sqrt",
jim22k:python-graphblas/graphblas/core/operator/semiring.py 409- ] 410- if _supports_complex: 411- position_dtypes.extend([FC32, FC64]) 412- notbool_dtypes.extend([FC32, FC64]) 413- for lnames, rnames, *types in [ 414: # fmt: off 415- ( 416- ("any", "max", "min", "plus", "times"), 417- ( 418- "firsti", "firsti1", "firstj", "firstj1", 419- "secondi", "secondi1", "secondj", "secondj1",
jim22k:python-graphblas/graphblas/core/operator/binary.py 743- UINT64, 744- ] 745- if _supports_complex: 746- position_dtypes.extend([FC32, FC64]) 747- name_types = [ 748: # fmt: off 749- ( 750- ("atan2", "copysign", "fmod", "hypot", "ldexp", "remainder"), 751- ((BOOL, INT8, INT16, UINT8, UINT16), FP32), 752- ((INT32, INT64, UINT32, UINT64), FP64), 753- ),
Either use fmt: skip
, or assign the list to a temporary variable and suppress formatting for the assignment statement.
Inside assertions
huggingface:transformers/tests/models/mbart50/test_tokenization_mbart50.py 84- ) 85- 86- tokens = tokenizer.tokenize("I was born in 92000, and this is falsé.") 87- self.assertListEqual( 88- tokens, 89: # fmt: off 90- [SPIECE_UNDERLINE + "I", SPIECE_UNDERLINE + "was", SPIECE_UNDERLINE + "b", "or", "n", SPIECE_UNDERLINE + "in", SPIECE_UNDERLINE + "", "9", "2", "0", "0", "0", ",", SPIECE_UNDERLINE + "and", SPIECE_UNDERLINE + "this", SPIECE_UNDERLINE + "is", SPIECE_UNDERLINE + "f", "al", "s", "é", "."], 91- # fmt: on 92- ) 93- ids = tokenizer.convert_tokens_to_ids(tokens) 94- self.assertListEqual(
100- )
78- self.assertListEqual( 79- tokens, 80: # fmt: off 81- ['I', 'Ġwas', 'Ġborn', 'Ġin', 'Ġ9', '2000', ',', 'Ġand', 'Ġ', 'this', 'Ġis', 'Ġfals', 'é', '.'], 82- # fmt: on 83- )
I've seen both fmt: off
comments on the statement and expression level. Disallowing fmt: off
on the expression level increases consistency.
If Conditions
capyloon:gecko-b2g/third_party/python/pip/pip/_internal/commands/install.py 324- target_temp_dir_path: Optional[str] = None 325- if options.target_dir: 326- options.ignore_installed = True 327- options.target_dir = os.path.abspath(options.target_dir) 328- if ( 329: # fmt: off 330- os.path.exists(options.target_dir) and 331- not os.path.isdir(options.target_dir) 332- # fmt: on 333- ): 334- raise CommandError(
Inside dictionaries
huggingface:transformers/tests/pipelines/test_pipelines_token_classification.py 501- "word": "##zo", 502- "start": 2, 503- "end": 4, 504- }, 505- { 506: # fmt: off 507- "scores": np.array([0, 0, 0, 0.9986497163772583, 0]), 508- # fmt: on 509- "index": 7, 510- "word": "UN", 511- "is_subword": False,
557- "word": "##zo", 558- "start": 2, 559- "end": 4, 560- }, 561- { 562: # fmt: off 563- "scores": np.array([0, 0, 0, 0, 0, 0.9986497163772583, 0, 0, ]), 564- # fmt: on 565- "index": 7, 566- "word": "UN", 567- "is_subword": False,
This seems useful but won't be supported anymore.
Observation
The main use case for suppression comments is when it comes to formatting lists. We should improve list formatting by e.g. using a fill layout when all expressions in a list are simple expressions (numbers, simple strings). The fill layout tries to fit as many items as possible on a line rather than putting each item on its own line.
Open Question
- Should we enforce/warn about missing
fmt: on
comments? - Is divering on the suppression comment an adoption blocker?