spec: clarify interaction of Final and dataclass by carljm · Pull Request #1669 · python/typing (original) (raw)
See discussion thread at https://discuss.python.org/t/treatment-of-final-attributes-in-dataclass-likes/47154 and issue at python/cpython#89547
Consider a dataclass with a Final-annotated initialized assignment in the class body:
@dataclass
class C:
x: Final[int] = 3
This is currently under-specified in the typing spec. Neither PEP 591 or PEP 681 clearly specified this composition.
There are two possible consistent interpretations:
- It could be a class-only variable (implicit
ClassVar
)C.x
with final value3
. In this interpretation, it should not be a dataclass field, because dataclass fields are inherently set on instances, they cannot be ClassVar. This is the interpretation suggested by PEP 591 (and thus also the current typing spec for Final), which say that Final with an assigned value in a class body is always implicitly a ClassVar. - It could be a dataclass field
x
with default value3
, which cannot be reassigned on an instance after it is initialized in the generated__init__
method.
This pull request specifies option 2.
I will also provide an update to the conformance suite for this, if the Typing Council accepts the spec change.
Current runtime behavior
The stdlib dataclasses module considers x
in this example to be a dataclass field.
If the assigned value is a default value (as in the example above), dataclasses leaves that default value in place as a class attribute (so x
is a class-and-instance variable, and C.x == 3
.) If the assigned value is a field(...)
call, dataclasses does not leave any attribute x
on the class at all, so x
is purely a dataclass field / instance variable.
Current type-checker behavior
Playground links:
All three agree with the runtime behavior of dataclasses, considering x
to be a dataclass field which is included in the dataclass __init__
method and set on instances in __init__
. All three prohibit further assignments to x
on instances, noting that it is a final attribute. Pyright also mentions that it is a ClassVar, which is confusing given that "ClassVar" and "dataclass field" are (or should be) mutually exclusive.
All three assume that C.x
is available and of type int
, whether we have x: Final[int] = 3
or x: Final[int] = field(default_value=3)
. That is, none of the type checkers model the actual runtime behavior that field()
objects are removed from the class and not replaced with anything. (Arguably this behavior is just a bug/inconsistency in the dataclasses runtime implementation.)
Considerations
- Option 2 (proposed in this PR) is clearly less disruptive to existing code, as it effectively just specifies the de facto current behavior of the runtime and all type checkers. The only practical change to type checker behavior in the given examples that would occur as a result of this spec change is that Pyright should not claim in its error message that
x
is a ClassVar, but otherwise behavior would remain as it is; errors would still be raised on exactly the same lines by all three type checkers. - Option 2 has the downside that one cannot assume as an invariant that the syntax
x: Final[int] = 3
always means thatx
will always have value exactly3
. In dataclass bodies this will be more nuanced: the class attribute will always be3
, but it will be shadowed by an instance attribute that may have a different value on any given instance. On the other hand, it is already the case that the semantics of annotated assignments in dataclass bodies differ from other contexts:x: int = field(...)
whenfield(...)
does not returnint
would obviously be a type error anywhere else. This example illustrates that in dataclasses, the annotation generally does not apply to the immediate assignment, but to the type of the dataclass field (instance var) specified by the annotated assignment. - If we select Option 1, it will be quite cumbersome to achieve a Final dataclass field (one would have to write out the
__init__
method manually instead of allowing dataclass to generate it, eliminating a key benefit of dataclasses.) If we select Option 2, it remains quite easy to have a final classvar on a dataclass, using aClassVar[Final[...]]
annotation.