Revisiting attribute docstrings
October 17, 2023, 2:50am 1
PEP 224 (Attribute Docstrings) proposed a syntax for class attribute docstrings:
class A:
    b = 42
    """Some documentation."""
    c = None
This was rejected because of ambiguity for readers about which attribute the docstring referred to.
With the prevalence of Sphinx, it is now understood that the docstring refers to the immediately preceding symbol (see the docs).
Some in the community don’t like the approach introduced in PEP 727 (Documentation in Annotated Metadata), where a symbol’s documentation is a field in its Annotated annotation, and wish to introduce more Pythonic syntax to address the problems raised in that PEP. Below is a proposal which does just that.
I propose an extended form of PEP 224 to document symbols: module attributes, class attributes, and function parameters.
module_attribute = "spam"
"""A string."""

class AClass:
    class_attribute = 42
    """An integer."""

def foo(
    bar,
    """A required parameter."""
    baz: int = None,
    """An optional parameter."""
):
    ...
A docstring will always document the immediately-prior symbol at the same indentation level. Note that the comma , following a parameter definition must go before the docstring, to prevent string concatenation with string-typed parameter defaults.
The docstring’s value will be stored in a new attribute __docstrings__, defined only on usage of this proposal (injected into the parent’s __dict__ after the parent is defined and processed). For module and class attributes, __docstrings__ is set on the module and class respectively. For function parameters, __docstrings__ is set on the function. [1]
I’ve found one usage on GitHub already using the name __docstrings__.
inspect.Parameter would gain a new instance attribute docstring which has the parameter’s corresponding docstring value. A search shows some potential conflict with existing code.
[1] Perhaps type, ModuleType and FunctionType could learn a __docstrings__ getter-property; this is an implementation detail.
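For module and class attributes, the rule in this post (a bare string documents the assignment immediately before it) can already be approximated with the standard ast module. The following is only a minimal sketch under that assumption: collect_attribute_docstrings is a hypothetical helper name, it returns a plain dict rather than setting any __docstrings__ attribute, and it does not cover function parameters, which need new syntax.

```python
import ast


def collect_attribute_docstrings(source: str) -> dict[str, dict[str, str]]:
    """Map each scope ("<module>" or a class name) to {attribute: docstring}."""
    result: dict[str, dict[str, str]] = {}

    def scan(scope_name: str, body: list[ast.stmt]) -> None:
        docs: dict[str, str] = {}
        for prev, node in zip(body, body[1:]):
            # A candidate docstring is a bare string-literal statement...
            is_doc = (
                isinstance(node, ast.Expr)
                and isinstance(node.value, ast.Constant)
                and isinstance(node.value.value, str)
            )
            if not is_doc:
                continue
            # ...that directly follows an assignment to a plain name.
            if isinstance(prev, ast.Assign):
                targets = prev.targets
            elif isinstance(prev, ast.AnnAssign):
                targets = [prev.target]
            else:
                continue
            for target in targets:
                if isinstance(target, ast.Name):
                    docs[target.id] = node.value.value
        if docs:
            result[scope_name] = docs
        # Recurse into classes so class attributes are collected too.
        for node in body:
            if isinstance(node, ast.ClassDef):
                scan(node.name, node.body)

    scan("<module>", ast.parse(source).body)
    return result


SOURCE = '''
module_attribute = "spam"
"""A string."""

class AClass:
    class_attribute = 42
    """An integer."""
'''

docstrings = collect_attribute_docstrings(SOURCE)
```

Here docstrings holds one mapping for the module and one for AClass, which is roughly the information the proposed __docstrings__ attribute would expose at runtime.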
NeilGirdhar (Neil Girdhar) October 17, 2023, 4:47am 2
I’d definitely use this if it were accepted since I prefer having docstrings closer to the variable, and I dislike repeating parameter names.
A few questions:
- Is there any way to put the docstring on the same line as the variable or parameter? That ability alone makes comments a close competitor.
- Are you making any recommendation about where the comma should go with parameter docstrings (your example shows two places)?
- If we have programmatic access to parameter docstrings, will they be available in inspect.Parameter?
csm10495 (Charles Machalow) October 17, 2023, 5:14am 3
I personally can’t stand when the docstring is below a non-function-like variable, especially since a lot of the time these would be one-line comments. I’d say either do them on the same line or above.
EpicWink (Laurie O) October 17, 2023, 5:21am 4
I can’t think of a good solution which looks legible, so I’d prefer to stick to the existing forms with class and module attributes.
I don’t mind what the final solution is, but the proposal in my original post explicitly allows for either.
Yes, I’ll update the original post to make a comment on Parameter.
The docstring is parsed by Sphinx (and other tools, like the PyCharm IDE) and used as the documentation for that symbol, which a comment can’t currently do (nor do I think it should). This proposal would further make that available at runtime.
csm10495 (Charles Machalow) October 17, 2023, 5:25am 5
I understand but I personally prefer the other way: have the docstring above the definition.
Even if they can be parsed by tooling, I have to look at it, and don’t like the way docstring below variable looks.
I don’t think I’ve ever seen docstring below variable besides class/function declaration.
If we decide to formalize a format, I’d prefer it be above or on the same line.
Edit:
I think I’ve seen tools parse
V = 'hello' #: this is the docstring for V
as a way of doing a one-liner.
Similarly for two lines:
#: this is the docstring for V
V = 'hello'
EpicWink (Laurie O) October 17, 2023, 5:30am 6
You’re probably referring to Sphinx’s autoattribute, which also supports docstrings after the attribute
BrenBarn (Brendan Barnwell) October 17, 2023, 6:23am 7
I’d consider that a showstopper. The comma is an explicit separator. Having a docstring after a comma grouped with a parameter before the comma seems red-alert ultra-confusing to me.
barry (Barry Warsaw) October 18, 2023, 12:22am 8
Same, for parity with where I would normally write the comment. I’ve never seen a comment below a line refer to the line above.
NeilGirdhar (Neil Girdhar) October 18, 2023, 2:03am 9
Would it be worth contrasting this PEP’s proposed notation with the #: notation in the PEP?
fonini (Pedro Fonini) October 18, 2023, 2:27am 10
Placing the comma after the docstring introduces a syntax ambiguity:
def foo(
    param: str = "some default value"
    """Some documentation""",
):
    ...
Is that a string concatenation? Or is the second one a docstring? With today’s syntax, it’s a string concatenation, so if it’s now a docstring then that’s a backwards-incompatible syntax change.
On the other hand, I would agree with @BrenBarn that separating the parameter from its docstring with a comma would be egregious.
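The current behaviour is easy to check. This short sketch reuses the example above, with the body changed to return the parameter so the default can be inspected:

```python
def foo(
    param: str = "some default value"
    """Some documentation""",
):
    return param


# Adjacent string literals are joined at compile time, so the
# would-be docstring becomes part of the default value.
default_value = foo()
print(default_value)  # some default valueSome documentation
```

Any code written this way today would silently change its default value if the second literal were reinterpreted as a docstring.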
EpicWink (Laurie O) October 18, 2023, 8:29am 11
No PEP yet, but I likely will if and when I make one. The original post is simply motivation and syntax.
From GitHub searches:
- symbol documentation using special comments #: is in 33 900 files (search)
- symbol documentation using multi-line strings """ is in 68 200 files (search) (although this has an indeterminate number of false-positives)
I’m personally not a fan of the #: syntax as I consider all parts of the code starting with hash # to be stripped from the runtime (but potentially used by some tools: the most popular ones I can think of are black and coverage). I also prefer the consistency of having docstrings after declarations (i.e. the case right now with functions, classes and modules).
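As a rough illustration of that difference, the sketch below parses a small made-up module with the standard ast module. The names a and b and both documentation strings are invented for the example. The comment never reaches the syntax tree, while the string after b survives as its own statement, which is where an implementation of this proposal could pick it up. Neither form is exposed on the module at runtime today.

```python
import ast

SOURCE = '''
#: comment-style documentation for a
a = 1

b = 2
"""String-style documentation for b."""
'''

tree = ast.parse(SOURCE)

# The comment produces no node; the string is an expression statement.
statement_types = [type(node).__name__ for node in tree.body]

string_statements = [
    node.value.value
    for node in tree.body
    if isinstance(node, ast.Expr)
    and isinstance(node.value, ast.Constant)
    and isinstance(node.value.value, str)
]

comment_in_tree = "comment-style" in ast.dump(tree)
```

After running this, statement_types lists two assignments followed by one expression statement, and comment_in_tree is False.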
That’s a problem; I don’t think we should change string-concatenation semantics. I’ll update the original post to remove that option.
Perhaps there’s opportunity to add a delimiter, e.g. a semicolon ;.
tmk (Thomas Kehrenberg) October 18, 2023, 8:46am 12
Would it be stored as a dictionary with the attribute/parameter name as key and the doc string as value? Or what did you have in mind?
NeilGirdhar (Neil Girdhar) October 18, 2023, 11:30am 13
A delimiter would be ideal since it would allow you to put the docstring on the same line for parameters and variables. E.g.,
bias: float = 4; "The bias of the model"
weights: list[float]
"""The weights of model.
Initialized to zero.
"""
and similarly for parameters (with a comma after the docstring). Is that what you have in mind?
So, essentially a parameter x has an optional type annotation indicated by :, an optional default indicated by =, and an optional docstring indicated by (perhaps) ;?
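For context on how the semicolon behaves today, here is a small check using the standard ast module and compile(). It is only a sketch of current syntax, not of the proposed semantics: at module or class level the line already parses as two separate statements, whereas inside a parameter list the semicolon is a syntax error, so only the parameter form would need a grammar change.

```python
import ast

# At module or class level this is already valid: an annotated
# assignment followed by a bare string expression statement.
tree = ast.parse('bias: float = 4; "The bias of the model"')
kinds = [type(node).__name__ for node in tree.body]

# Inside a parameter list the same delimiter is rejected today.
try:
    compile(
        'def f(bias: float = 4; "The bias of the model"): ...',
        "<sketch>",
        "exec",
    )
except SyntaxError:
    parameter_form_is_valid = False
else:
    parameter_form_is_valid = True
```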
davidism (David Lord) October 18, 2023, 2:30pm 14
In Flask and my other projects, I’m moving away from #: above to """ below attributes to document them.
#: has the problem of messing with indentation levels, since it’s 3 characters before you start typing, so if you need to indent further for some reason, you have to manually add an extra few spaces after pressing tab to get things properly indented. #: also isn’t handled by IDEs well, so typing that before every line of a multiline doc is tedious.
""" avoids those two issues, and also matches what documentation looks like and where it’s found for classes and functions already. It’s also easier to modify lines and reflow text later.
I guess this sort of defeats the purpose of this discussion as opposed to the docs-in-annotations PEP, but I’d personally leave out parameters from a proposal. Aside from the ambiguity it can create, I still think parameter documentation looks better in the class/function docstring rather than next to each parameter.
barry (Barry Warsaw) October 18, 2023, 3:27pm 15
Clearly we need TOCs - triple octothorp comments.
EpicWink (Laurie O) October 18, 2023, 9:23pm 16
Yes, specifically a mapping from strings to strings (don’t need to require all dict methods).
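One way to picture such a mapping is a read-only view over a dict. This is purely illustrative: the attribute is attached by hand here, whereas under the proposal the interpreter would set it, and the choice of types.MappingProxyType is an assumption, not part of the proposal.

```python
from types import MappingProxyType


class AClass:
    class_attribute = 42


# Hypothetical: set manually for illustration only.
AClass.__docstrings__ = MappingProxyType({"class_attribute": "An integer."})

doc = AClass.__docstrings__["class_attribute"]

# A mapping proxy supports lookup and iteration but not item assignment.
try:
    AClass.__docstrings__["class_attribute"] = "changed"
except TypeError:
    is_read_only = True
else:
    is_read_only = False
```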
That’s a good option, and was something I was thinking about. I’d imagine many aren’t comfortable with slightly altering the definition of the semicolon ; though.
Parameter documentation is my entire motivation: the module and class attribute was just a bonus.
I agree that parameter docs look better in the docstring, but there are situations where the docs are better suited near the parameter and unambiguous, for example when needed at runtime.
davidism (David Lord) October 18, 2023, 9:57pm 17
My main point was a preference towards """ over #: style (or both, as Sphinx does now). If you’re confident about parameter docstrings as well, that’s fine.
pawamoy (Timothée Mazzucotelli) October 20, 2023, 1:55pm 18
Having docstrings above attributes creates an ambiguity with module docstrings:
# ambiguous.py
"""Am I the module docstring, or the docstring of `hello`?"""
hello = "hello"
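Under today’s rules the answer is the module: a string that is the first statement in a file is taken as the module docstring. A quick check with the standard ast module, using the file contents above as an inline string:

```python
import ast

SOURCE = '''# ambiguous.py
"""Am I the module docstring, or the docstring of `hello`?"""
hello = "hello"
'''

# ast.get_docstring returns the leading string of a module, if any.
module_doc = ast.get_docstring(ast.parse(SOURCE))
print(module_doc)  # Am I the module docstring, or the docstring of `hello`?
```

A docstring-above convention would therefore need a rule to tell these two cases apart.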
apalala (Juancarlo Añez) October 20, 2023, 7:39pm 19
We could compromise on syntax that goes in the same line as the attribute. It would also work for function arguments. I didn’t know about #:, and I like it, except that it could interfere with comments aimed at guiding commonly-used linters.
csm10495 (Charles Machalow) October 20, 2023, 11:06pm 20
Maybe ambiguous to a computer, but to a human the extra newline tells us.