Suppression Comments · astral-sh/ruff · Discussion #6338 (original) (raw)

This discussion proposes a design for suppression-comments in the formatter and documents the decision for the future.

Black

Black supports fmt: off followed by an optional fmt: on own-line comments. Anything between does not get formatted:

Black reformats entire files in place. It doesn’t reformat lines that end with # fmt: skip or blocks that start with # fmt: off and end with # fmt: on. # fmt: on/off must be on the same level of indentation and in the same block, meaning no unindents beyond the initial indentation level between them. It also recognizes YAPF’s block comments to the same effect as a courtesy for straddling code. [source]

if outer: if test: # fmt:off does = not_get + formatted # fmt:on print("gets formatted")

The formatting gets automatically re-enabled if there's a dedent decreasing the indention level below the initial indentation of the fmt: off comment. The following is semantically identical to the above example:

if outer: if test: # fmt:off does = not_get + formatted print("gets formatted")

The indentation behavior doesn't apply to parenthesized expressions (any expression enclosed by a (), [], or {} pair). Instead, Black re-enables the formatting when reaching the opening parentheses, meaning the following two examples are semantically equivalent:

```python a = ( # fmt: off does + not_get + formatted + # fmt: on ) + but + this + does

a = ( # fmt: off does + not_get + formatted + ) + but + this + does

Black further supports trailing fmt: skip comments that disable formatting for the preceding line. I haven't been able to figure out fully what the logic is when using inside of expressions. Here a few examples:

Doesn't format anything except that it adds two spaces before the comment

if ( aaaa < True + c): # fmt:skip bbb + c

Formats everything, treads fmt:skip as a regular trailing comment?

if ( aaaa < True # fmt:skip + c): bbb + c

Formats content except the line annotated with fmt:skip

if ( aaaa < (aaa + b) # fmt:skip + c ): bbb + c

Proposal

Implementing Black's very flexible fmt: off..fmt: on suppression comments is challenging with our formatter architecture, and I also don't think it is necessary. I analyzed the usages of our ecosystem projects, and the vast majority only use statement-level suppression comments. These are easy to reason about and almost always sufficient.

Suppression comments are also a sign that the formatter isn't good enough. I want to focus on improving formatting rather than investing in a very flexible and powerful suppression system. Suppression comments don't just mean that we're back to discussing how the suppressed code should be formatted, but the suppression also decreases readability overall.

What's important from my perspective is that:

Own line fmt: off and fmt: on comments

fmt: off and fmt: on comments allow to disable formatting for a range of statements. The statements must have the same or higher indentation level than the statement annotated with the fmt: off comment. The fmt: on is optional, but its best practice to use it.

I propose to limit fmt: off... and fmt: on to statement level only. The following examples are all valid

a = b

fmt: off

not_formatted = 3 more

fmt:on

if True:

fmt: off

suite if and_nested: not_formatted

print("formatted")

However, the following examples are not valid (but are valid in black)

if True: pass

fmt: off

elif False: pass

@Test

fmt: off

@Test2 def Abcd(): pass

That's why I believe fmt: skip is useful

Trailing fmt: skip or fmt: off comments

A trailing fmt: skip or fmt: off comment suppresses formatting for the preceding statement, case header, decorator, function definition, or class definition:

if True: pass elif False: # fmt: skip pass

@Test @Test2 # fmt: off def test(): ...

a = [1, 2, 3, 4, 5] # fmt: off

def test(a, b, c, d, e, f) -> int: # fmt: skip pass

There are two caveats where the use of fmt: skip is not intuitive:

a; b; c # fmt: skip

def test(): pass # fmt: skip

I think this is a shortcoming that we have to accept.

Unsupported Use cases

I went through all fmt: off and fmt: skip usages to identify the uses that either require changing the suppression comments or are entirely unsupported.

Between decorators

huggingface:transformers/src/transformers/models/gpt2/modeling_gpt2.py 1516- 1517- # Initialize weights and apply final processing 1518- self.post_init() 1519- 1520- @add_start_docstrings_to_model_forward(GPT2_INPUTS_DOCSTRING) 1521: # fmt: off 1522- @add_code_sample_docstrings( 1523- checkpoint="brad1141/gpt2-finetuned-comp2", 1524- output_type=TokenClassifierOutput, 1525- config_class=_CONFIG_FOR_DOC, 1526- expected_loss=0.25,

-> Use a trailing fmt: skip

Inside Lists

752- vocab += [(piece.piece, piece.score) for piece in proto.pieces[3:]] 753- vocab += [ 754: # fmt: off 755- ('ace_Arab', 0.0), ('ace_Latn', 0.0), ('acm_Arab', 0.0), ('acq_Arab', 0.0), ('aeb_Arab', 0.0), ('afr_Latn', 0.0), ('ajp_Arab', 0.0), ('aka_Latn', 0.0), ('amh_Ethi', 0.0), ('apc_Arab', 0.0), ('arb_Arab', 0.0), ('ars_Arab', 0.0), ('ary_Arab', 0.0), ('arz_Arab', 0.0), ('asm_Beng', 0.0), ('ast_Latn', 0.0), ('awa_Deva', 0.0), ('ayr_Latn', 0.0), ('azb_Arab', 0.0), ('azj_Latn', 0.0), ('bak_Cyrl', 0.0), ('bam_Latn', 0.0), ('ban_Latn', 0.0), ('bel_Cyrl', 0.0), ('bem_Latn', 0.0), ... 756- # fmt: on 757- ]

jim22k:python-graphblas/graphblas/core/operator/unary.py 334- UINT64, 335- ] 336- if _supports_complex: 337- position_dtypes.extend([FC32, FC64]) 338- for names, *types in [ 339: # fmt: off 340- ( 341- ( 342- "erf", "erfc", "lgamma", "tgamma", "acos", "acosh", "asin", "asinh", 343- "atan", "atanh", "ceil", "cos", "cosh", "exp", "exp2", "expm1", "floor", 344- "log", "log10", "log1p", "log2", "round", "signum", "sin", "sinh", "sqrt",

jim22k:python-graphblas/graphblas/core/operator/semiring.py 409- ] 410- if _supports_complex: 411- position_dtypes.extend([FC32, FC64]) 412- notbool_dtypes.extend([FC32, FC64]) 413- for lnames, rnames, *types in [ 414: # fmt: off 415- ( 416- ("any", "max", "min", "plus", "times"), 417- ( 418- "firsti", "firsti1", "firstj", "firstj1", 419- "secondi", "secondi1", "secondj", "secondj1",

jim22k:python-graphblas/graphblas/core/operator/binary.py 743- UINT64, 744- ] 745- if _supports_complex: 746- position_dtypes.extend([FC32, FC64]) 747- name_types = [ 748: # fmt: off 749- ( 750- ("atan2", "copysign", "fmod", "hypot", "ldexp", "remainder"), 751- ((BOOL, INT8, INT16, UINT8, UINT16), FP32), 752- ((INT32, INT64, UINT32, UINT64), FP64), 753- ),

Either use fmt: skip, or assign the list to a temporary variable and suppress formatting for the assignment statement.

Inside assertions

huggingface:transformers/tests/models/mbart50/test_tokenization_mbart50.py 84- ) 85- 86- tokens = tokenizer.tokenize("I was born in 92000, and this is falsé.") 87- self.assertListEqual( 88- tokens, 89: # fmt: off 90- [SPIECE_UNDERLINE + "I", SPIECE_UNDERLINE + "was", SPIECE_UNDERLINE + "b", "or", "n", SPIECE_UNDERLINE + "in", SPIECE_UNDERLINE + "", "9", "2", "0", "0", "0", ",", SPIECE_UNDERLINE + "and", SPIECE_UNDERLINE + "this", SPIECE_UNDERLINE + "is", SPIECE_UNDERLINE + "f", "al", "s", "é", "."], 91- # fmt: on 92- ) 93- ids = tokenizer.convert_tokens_to_ids(tokens) 94- self.assertListEqual(

100- )

78- self.assertListEqual( 79- tokens, 80: # fmt: off 81- ['I', 'Ġwas', 'Ġborn', 'Ġin', 'Ġ9', '2000', ',', 'Ġand', 'Ġ', 'this', 'Ġis', 'Ġfals', 'é', '.'], 82- # fmt: on 83- )

I've seen both fmt: off comments on the statement and expression level. Disallowing fmt: off on the expression level increases consistency.

If Conditions

capyloon:gecko-b2g/third_party/python/pip/pip/_internal/commands/install.py 324- target_temp_dir_path: Optional[str] = None 325- if options.target_dir: 326- options.ignore_installed = True 327- options.target_dir = os.path.abspath(options.target_dir) 328- if ( 329: # fmt: off 330- os.path.exists(options.target_dir) and 331- not os.path.isdir(options.target_dir) 332- # fmt: on 333- ): 334- raise CommandError(

Inside dictionaries

huggingface:transformers/tests/pipelines/test_pipelines_token_classification.py 501- "word": "##zo", 502- "start": 2, 503- "end": 4, 504- }, 505- { 506: # fmt: off 507- "scores": np.array([0, 0, 0, 0.9986497163772583, 0]), 508- # fmt: on 509- "index": 7, 510- "word": "UN", 511- "is_subword": False,

557- "word": "##zo", 558- "start": 2, 559- "end": 4, 560- }, 561- { 562: # fmt: off 563- "scores": np.array([0, 0, 0, 0, 0, 0.9986497163772583, 0, 0, ]), 564- # fmt: on 565- "index": 7, 566- "word": "UN", 567- "is_subword": False,

This seems useful but won't be supported anymore.

Observation

The main use case for suppression comments is when it comes to formatting lists. We should improve list formatting by e.g. using a fill layout when all expressions in a list are simple expressions (numbers, simple strings). The fill layout tries to fit as many items as possible on a line rather than putting each item on its own line.

Open Question