msg282363 - (view) |
Author: Ivan Levkivskyi (levkivskyi) *  |
Date: 2016-12-04 21:09 |
__module__ attribute is set differently depending on whether a metaclass is explicitly called or invoked in a class statement: >>> A = ABCMeta('A', (), {}) >>> A.__module__ 'abc' >>> class B(metaclass=ABCMeta): ... ... >>> B.__module__ '__main__' Documentation on data model says that "__module__ is the module name in which the class was defined", so that the second behaviour seems right, while the first behaviour seems wrong to me. |
|
|
msg282365 - (view) |
Author: Anilyka Barry (abarry) *  |
Date: 2016-12-04 21:12 |
As a matter of fact, A.__module__ in this case is abc.ABCMeta.__module__. A class body creates a __module__ key, while a direct metaclass call does not. |
|
|
msg282367 - (view) |
Author: Ivan Levkivskyi (levkivskyi) *  |
Date: 2016-12-04 21:21 |
> As a matter of fact, A.__module__ in this case is abc.ABCMeta.__module__. A class body creates a __module__ key, while a direct metaclass call does not. But >>> A = ABCMeta('A', (), {}) >>> ABCMeta.__module__ = 'hi' >>> A.__module__ 'abc' >>> ABCMeta.__module__ 'hi' This means that the __module__ is copied from metaclass (also A.__dict__ actually contains '__module__' key, checked in 3.6). |
|
|
msg282369 - (view) |
Author: Anilyka Barry (abarry) *  |
Date: 2016-12-04 21:37 |
Oh, nicely spotted! Apparently I was wrong, and it does create a key; defaulting to __name__. About the original issue, I don't think it's easily possible to fix, sadly. |
|
|
msg282378 - (view) |
Author: Steven D'Aprano (steven.daprano) *  |
Date: 2016-12-04 23:15 |
I had a brief look at the source for ABCMeta, and it seems to me that the __module__ behaviour is coming from `type`. I'm not sure whether it can, or should, can be fixed in type, but I think that the correct behaviour for ABCMeta is to set __module__ to the caller's global "__name__", not its own. Something like this should probably work: class ABCMeta(type): def __new__(mcls, name, bases, namespace): if '__module__' not in namespace: # globals()['__name__'] gives 'abc' frame = inspect.currentframe() if frame is not None: # IronPython? caller_globals = frame.f_back.f_globals namespace['__module__'] = caller_globals['__name__'] cls = super().__new__(mcls, name, bases, namespace) ... |
|
|
msg345073 - (view) |
Author: Alejandro Gonzalez (alegonz) * |
Date: 2019-06-09 06:03 |
Hello, I would like to add further on this issue. (I'm new to bpo. Please advise if this report could be better) Issue ----- Pickling breaks when attempting to dump an instance of a class that was defined by calling ABCMeta directly, because it attempts to look for the class in `abc` rather than the module the class was defined in. Below is a short snippet to reproduce the issue (my environment is Python 3.7.3 on Ubuntu 16.04) >>> import pickle >>> from abc import ABCMeta >>> class Foo(metaclass=ABCMeta): >>> pass >>> Bar = ABCMeta('Bar', (), {}) >>> foo = Foo() >>> bar = Bar() >>> foo_p = pickle.dumps(foo) # OK >>> bar_p = pickle.dumps(bar) # PicklingError: Can't pickle <class 'abc.Bar'>: attribute lookup Bar on abc failed I encountered this issue when working on a class factory to define classes dynamically. You can work around it if you smuggle `{'__module__': __name__}` when calling the metaclass, e.g.: >>> Qux = ABCMeta('Qux', (), {'__module__': __name__}) >>> qux = Qux() >>> qux_p = pickle.dumps(qux) # OK Apparently others have also stumbled upon this ABCMeta behavior before: https://stackoverflow.com/q/49457441 Some ideas to solve this ------------------------ A. Steven's proposal seems that could work (I haven't tested it myself yet though). - This, however, would be limited to ABCMeta. Any other metaclass subclassed from type would have to include that workaround too for it to play well with pickle, which leads to, B. Do A. and include in the documentation some recipe that shows this workaround when subclassing from type. C. Implement A. in type itself. (Maybe around here? https://github.com/python/cpython/blob/master/Objects/typeobject.c#L2603) - I'm not familiar with the internals of CPython, but I guess this would be the nuclear option. If any of above ideas (or some other idea that could come up upon discussion) seems sensible, please allow me to work on it. I'd be happy to contribute :) |
|
|
msg345226 - (view) |
Author: Ivan Levkivskyi (levkivskyi) *  |
Date: 2019-06-11 11:30 |
I think we can proceed with option A, but only if doesn't cause visible slow-down for creating ABCs (which is already slower that normal classes). TBH, I don't want to document A as "official" recipe (maybe however mention the problem in the `pickle` module docs). Note that other "class factories" in stdlib (like collections.namedtuple) do almost exactly option A. |
|
|
msg345328 - (view) |
Author: Alejandro Gonzalez (alegonz) * |
Date: 2019-06-12 11:59 |
>I think we can proceed with option A, but only if doesn't cause visible slow-down for creating ABCs (which is already slower that normal classes). TBH, I don't want to document A as "official" recipe (maybe however mention the problem in the `pickle` module docs). I'll try using the same approach used in namedtuple, and see if there is any visible slow-down. Regarding the documentation, I see your point. Perhaps adding some section like "Notes on pickling dynamically-defined classes" in the `pickle` module would be more appropriate? That section would mention that you have to make sure to set the `__module__` attribute. >Note that other "class factories" in stdlib (like collections.namedtuple) do almost exactly option A. Thanks for the tip. Indeed it does. In fact, the API defines a module argument. Its documentation could be improved, though. It should mention why would you want to pass that argument (pickling seems to be the reason that argument was added in the first place). |
|
|
msg345332 - (view) |
Author: Ivan Levkivskyi (levkivskyi) *  |
Date: 2019-06-12 12:20 |
> Perhaps adding some section like "Notes on pickling dynamically-defined classes" in the `pickle` module would be more appropriate? I think just a note with few sentences would be enough. |
|
|
msg345770 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2019-06-16 19:33 |
Getting the module name from the caller's frame is an expensive operation. It is safe to set __module__ to None. In such case the pickle module is able to find the proper module containing the definition of the class. |
|
|
msg345772 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2019-06-16 19:37 |
>>> from abc import * >>> A = ABCMeta('A', (), {}) >>> A.__module__ 'abc' >>> import pickle, pickletools >>> pickletools.dis(pickletools.optimize(pickle.dumps(A))) Traceback (most recent call last): File "", line 1, in _pickle.PicklingError: Can't pickle <class 'abc.A'>: attribute lookup A on abc failed >>> A.__module__ = None >>> pickletools.dis(pickletools.optimize(pickle.dumps(A))) 0: \x80 PROTO 4 2: \x95 FRAME 15 11: \x8c SHORT_BINUNICODE '__main__' 21: \x8c SHORT_BINUNICODE 'A' 24: \x93 STACK_GLOBAL 25: . STOP highest protocol among opcodes = 4 |
|
|
msg345797 - (view) |
Author: Alejandro Gonzalez (alegonz) * |
Date: 2019-06-17 02:46 |
>Getting the module name from the caller's frame is an expensive operation. It is safe to set __module__ to None. In such case the pickle module is able to find the proper module containing the definition of the class. Wow, indeed it works. I also tried it from another module other than `__main__` and it works. Checking the source, I see the "magic" happens in pickle's `whichmodule` function when looping over `sys.modules` if `__module__` is not found. If that case, why check for `__module__` in the first place? wouldn't it be simpler to always loop over sys.modules? Is it to avoid looping over `sys.modules` whenever possible? |
|
|
msg345889 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2019-06-17 19:02 |
The problem is that __module__ is deduced from the caller's frame in type.__new__(). In case if type.__new__() is called directly, __module__ is set to the name of the module where type.__new__() is called. So virtually every type subclass which defines the __new__ method and calls type.__new__() inside it starves from the same issue. It can be fixed in general if deduce __module__ in type.__call__() instead of type.__new__(). |
|
|
msg345945 - (view) |
Author: Alejandro Gonzalez (alegonz) * |
Date: 2019-06-18 02:41 |
>It can be fixed in general if deduce __module__ in type.__call__() instead of type.__new__(). But wouldn't we have the same problem if a metaclass overrides type.__call__ instead? Also, wouldn't that reset the __module__ value anytime the class is called and an object is instantiated? |
|
|
msg348210 - (view) |
Author: Alejandro Gonzalez (alegonz) * |
Date: 2019-07-20 08:04 |
Hello, Could anyone take a look at the PR 14126 submitted? Sorry for the trouble and thank you in advance. |
|
|
msg348234 - (view) |
Author: Ivan Levkivskyi (levkivskyi) *  |
Date: 2019-07-21 09:44 |
FWIW I like Serhiy's approach more. I have never seen a single metaclass overriding type.__call__, while overriding type.__new__ is a common practice. |
|
|
msg348312 - (view) |
Author: Alejandro Gonzalez (alegonz) * |
Date: 2019-07-23 00:54 |
I see, that’s an interesting point. In that case, Serhiy’s approach is indeed better. PR 14166 seems stalled. I’d like to help if there’s anything left to do, if possible :) |
|
|
msg354071 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2019-10-07 10:18 |
There is a simpler way to fix this issue. Instead of hacking __new__, the inheritance registry can be set up in ABCMeta.__init__. There are no reasons of making this in __new__. |
|
|
msg354263 - (view) |
Author: hongweipeng (hongweipeng) * |
Date: 2019-10-09 10:38 |
I think we can refer to typing.py, it does not have this issue. >>> from typing import NamedTuple >>> A = NamedTuple('A', [('name', str), ('id', int)]) >>> class B(NamedTuple): ... name: str ... id: int ... >>> A.__module__ '__main__' >>> B.__module__ '__main__' It uses `nm_tpl.__module__ = sys._getframe(2).f_globals.get('__name__', '__main__')`. |
|
|
msg354573 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2019-10-13 11:03 |
PR 14166 now fixes the issue in more generic way. It skips one additional frame when type.__new__ is called not directly from type.__call__. |
|
|