Issue 35540: dataclasses.asdict breaks with defaultdict fields (original) (raw)

Issue35540

Created on 2018-12-19 22:06 by wrmsr, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 11361 open remi.lapeyre,2018-12-30 00:55
PR 11361 open remi.lapeyre,2018-12-30 00:55
PR 11361 open remi.lapeyre,2018-12-30 00:55
PR 16356 open p-ganssle,2019-09-24 14:17
PR 32056 open kwsp,2022-03-22 18:28
Messages (6)
msg332166 - (view) Author: Will T (wrmsr) Date: 2018-12-19 22:06
_asdict_inner attempts to manually recursively deepcopy dicts by calling type(obj) with a generator of transformed keyvalue tuples @ https://github.com/python/cpython/blob/b2f642ccd2f65d2f3bf77bbaa103dd2bc2733734/Lib/dataclasses.py#L1080 . defaultdicts are dicts so this runs but unlike other dicts their first arg has to be a callable or None: import collections import dataclasses as dc @dc.dataclass() class C: d: dict c = C(collections.defaultdict(lambda: 3, {})) d = dc.asdict(c) assert isinstance(d['d'], collections.defaultdict) assert d['d']['a'] == 3 => Traceback (most recent call last): File "boom.py", line 9, in d = dc.asdict(c) File "/Users/spinlock/.pyenv/versions/3.7.1/lib/python3.7/dataclasses.py", line 1019, in asdict return _asdict_inner(obj, dict_factory) File "/Users/spinlock/.pyenv/versions/3.7.1/lib/python3.7/dataclasses.py", line 1026, in _asdict_inner value = _asdict_inner(getattr(obj, f.name), dict_factory) File "/Users/spinlock/.pyenv/versions/3.7.1/lib/python3.7/dataclasses.py", line 1058, in _asdict_inner for k, v in obj.items()) TypeError: first argument must be callable or None I understand that it isn't this bit of code's job to support every dict (and list etc.) subclass under the sun but given defaultdict is stdlib it's imo worth supporting explicitly.
msg332741 - (view) Author: Rémi Lapeyre (remi.lapeyre) * Date: 2018-12-30 01:25
Hi @wrmsr, this happens because the constructor for `collections.defaultdict` differs from the one of `dict`. I think the argument that collections.defaultdict is in the stdlib and should be supported is right, the changes in PR #11361 should do what you want.
msg353092 - (view) Author: Paul Ganssle (p-ganssle) * (Python committer) Date: 2019-09-24 14:25
Considering that `namedtuple` is special-cased, I think it's reasonable to special-case `defaultdict` as well, though it may be worth considering more general solutions that will also work for things other than the standard library. One would be to solve this the same way that other "subclasses may have a different constructor" problems are solved (e.g. `float`, `int`, formerly `datetime`) and ignore the subclass (or selectively ignore it if it's a problem), for example changing _asdict_inner to something like this: if isinstance(obj, dict): new_keys = tuple((_asdict_inner(k, dict_factory), _asdict_inner(v, dict_factory)) for k, v in obj.items()) try: return type(obj)(new_keys) except Exception: return dict(new_keys) Another more general alternative would be to add a type registry for `asdict`, either as an additional parameter or with a new transformer class of some sort. I created a quick proof of concept for this in GH-16356 to see one way it could look. In any case I think it's quite unfortunate that we can't easily just support anything that has a __deepcopy__ defined. There may be some crazy solution that involves passing a class with a custom __getitem__ to the `memo` argument of copy.deepcopy, but if it's even possible (haven't thought about it enough) I'm not sure it's *advisable*.
msg353116 - (view) Author: Paul Ganssle (p-ganssle) * (Python committer) Date: 2019-09-24 19:48
I checked and it appears that `attrs` handles this by creating *all* dicts using the default dict_factory (similar to my original suggestion of just using `dict` instead of the specific type), if I'm reading this right: https://github.com/python-attrs/attrs/blob/master/src/attr/_funcs.py#L102 Using `attr.asdict` seems to bear this out, as `defaultdict` attributes are converted to `dict` when the dict factory is not specified. I think changing the default behavior like that would be a backwards-incompatible change at this point (and one that it's really hard to warn about, unfortunately), but we could still use the "fall back to dict_factory" behavior by trying to construct a `type(obj)(...)` and in the case of an exception return `dict_factory(...)`.
msg390850 - (view) Author: Alexandru Coca (alexcoca) Date: 2021-04-12 13:50
I was wondering if this issue is still being tracked for resolution? I found the same bug in Python 3.8.
msg405243 - (view) Author: Raymond Xu (greenfish6) Date: 2021-10-28 19:52
I am seeing this bug with 3.9.7
History
Date User Action Args
2022-04-11 14:59:09 admin set github: 79721
2022-03-22 18:28:58 kwsp set nosy: + kwsppull_requests: + <pull%5Frequest30148>
2021-10-28 19:52:01 greenfish6 set nosy: + greenfish6messages: +
2021-04-12 13:50:36 alexcoca set nosy: + alexcocamessages: +
2019-09-24 19:48:27 p-ganssle set keywords:patch, patch, patchmessages: +
2019-09-24 14:25:11 p-ganssle set keywords:patch, patch, patchnosy: + p-gansslemessages: +
2019-09-24 14:17:37 p-ganssle set pull_requests: + <pull%5Frequest15935>
2018-12-30 01:25:41 remi.lapeyre set nosy: + remi.lapeyremessages: +
2018-12-30 00:55:46 remi.lapeyre set keywords: + patchstage: patch reviewpull_requests: + <pull%5Frequest10684>
2018-12-30 00:55:38 remi.lapeyre set keywords: + patchstage: (no value)pull_requests: + <pull%5Frequest10683>
2018-12-30 00:55:30 remi.lapeyre set keywords: + patchstage: (no value)pull_requests: + <pull%5Frequest10682>
2018-12-21 17:56:37 levkivskyi set nosy: + levkivskyi
2018-12-19 22:32:30 eric.smith set assignee: eric.smithnosy: + eric.smith
2018-12-19 22:06:44 wrmsr create