Dataclass_transform: add inherit_defaults option

dataclass_transform (PEP 681) is a great extension that allows type checkers to support a myriad of dataclass-like structures in a consistent manner, without the need for plugins.

However, one topic that I could not find mentioned in either PEP 681 or PEP 557 (dataclasses) is how to deal with the defaults of overridden fields.

Unfortunately, not all libraries behave in the same way. On the one hand, stdlib dataclasses and pydantic dataclasses inherit the default from the parent class. On the other hand, attrs, attr and pydantic models do not inherit the default from the parent.

This poses a challenge for type checkers, as they need to decide what the correct behavior is, and whichever they pick produces false positives and false negatives when a library with the other behavior is used. In fact, mypy and pyright behave differently: pyright requires that an overridden field specify a default if the parent had one, while mypy assumes the default is not inherited.

For the sake of brevity I will only present results for stdlib dataclasses and pydantic models, but I can post the extra examples if required.

The following dataclasses code inherits the default value from the parent and does not produce any errors at runtime.

import dataclasses

@dataclasses.dataclass(frozen=True)
class Base:
  x: int = 3

@dataclasses.dataclass(frozen=True)
class Child(Base):
  x: int

print(Base())
print(Child(x=1))
print(Child())

pyright generates a false positive

test_dataclass.py:9:5 - error: "x" overrides a field of the same name but is missing a default value

mypy also generates a false positive

test_dataclass.py:13: error: Missing positional argument "x" in call to "Child" [call-arg]
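
The runtime behavior described above can be checked directly. The following is a minimal sketch that only uses the stdlib; the variable names child_field and inherited_default are mine, introduced for illustration.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Base:
    x: int = 3


@dataclasses.dataclass(frozen=True)
class Child(Base):
    x: int  # re-declared without a default


# Child has a single field, "x". Its default is picked up from the
# class attribute that Base left behind, so it is 3 rather than MISSING.
(child_field,) = dataclasses.fields(Child)
inherited_default = child_field.default

# Both calls succeed at runtime, contrary to what the checkers report.
child_without_argument = Child()
child_with_argument = Child(x=1)
```

So for stdlib dataclasses, removing the default in the subclass has no effect at runtime: the field still carries the parent's default.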

The following pydantic model code

import pydantic

class Base(pydantic.BaseModel, frozen=True):
  x: int = 3

class Child(Base, frozen=True):
  x: int

print(Base())
print(Child(x=1))
print(Child())

raises an error at runtime because of the missing x argument in the last call,

and pyright generates a false positive about the overridden field and a false negative about the last call to Child

test_pydantic_model.py:7:5 - error: "x" overrides a field of the same name but is missing a default value (reportGeneralTypeIssues)

while mypy correctly generates the error

test_pydantic_model.py:10: error: Missing named argument "x" for "Child" [call-arg]

One alternative to support both design choices is to extend PEP 681 with a new parameter called inherit_defaults that, when set to True, means that defaults from parent classes are inherited. In that case type checkers would also need to check that the inherited default is compatible with the overridden type. For example, if the original field was x: int | None = None, overriding it in a subclass as just x: int should produce an error, as None is not compatible with int.
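
To make the idea concrete, here is a rough sketch of how a library might opt in. Note that inherit_defaults is not part of PEP 681 and no type checker understands it today; it only runs because dataclass_transform accepts arbitrary extra keyword arguments. The decorator name my_dataclass is made up for this example, and Optional[int] stands in for int | None so that the snippet also runs on older Pythons.

```python
import dataclasses
from typing import Optional

try:
    from typing import dataclass_transform  # Python 3.11+
except ImportError:
    from typing_extensions import dataclass_transform


# HYPOTHETICAL: inherit_defaults is the parameter proposed in this post.
# It is accepted at runtime only because unknown keyword arguments are
# stored as-is; type checkers currently ignore it.
@dataclass_transform(inherit_defaults=True)
def my_dataclass(cls):
    # Delegate to the stdlib, which does inherit defaults at runtime.
    return dataclasses.dataclass(cls)


@my_dataclass
class Base:
    x: Optional[int] = None


@my_dataclass
class Child(Base):
    # With inherit_defaults=True a checker should reject this override:
    # the inherited default None is not compatible with int.
    x: int


# The extra keyword is recorded on the decorator for introspection.
recorded_kwargs = my_dataclass.__dataclass_transform__["kwargs"]

# At runtime the inherited default leaks through despite the int annotation.
leaked_default = Child().x
```

Here Child().x is None even though the field is annotated as int, which is exactly the situation the compatibility check would catch.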

Another option would be to extend (or amend) some other PEP (484 or 557) and mandate that type checkers require overridden fields whose parent has a default to also specify one. This is what pyright currently implements (motivated by this issue), which is equivalent to neither the hypothetical inherit_defaults=True nor inherit_defaults=False. This approach is currently the safest, as forcing the default to be restated ensures the code works with all dataclass alternatives. However, it is limiting for authors, as once a field has a default there is no way to remove it.
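
For reference, this is roughly what code looks like under that stricter rule: the override restates a default explicitly, so nothing depends on whether the library inherits it. This is a sketch with stdlib dataclasses only; the value 5 is arbitrary.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Base:
    x: int = 3


@dataclasses.dataclass(frozen=True)
class Child(Base):
    # The default is restated, so the result does not depend on
    # whether the library inherits defaults from the parent.
    x: int = 5


base_value = Base().x
child_value = Child().x
```

The cost is visible here too: Child cannot turn x back into a required argument.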