[Python-Dev] PEP 435: pickling enums created with the functional API
Eli Bendersky eliben at gmail.com
Tue May 7 17:44:46 CEST 2013
- Previous message: [Python-Dev] PEP 435: pickling enums created with the functional API
- Next message: [Python-Dev] PEP 435: pickling enums created with the functional API
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Tue, May 7, 2013 at 8:03 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
On Tue, May 7, 2013 at 11:34 PM, Eli Bendersky <eliben at gmail.com> wrote:
> One of the contended issues with PEP 435 on which Guido pronounced was the
> functional API, which allows creating enumerations dynamically in a manner
> similar to namedtuple:
>
>   Color = Enum('Color', 'red blue green')
>
> The biggest complaint reported against this API is its interaction with
> pickle. As promised, I want to discuss here how we're going to address this
> concern.
>
> At this point, the pickle docs say that module-top-level classes can be
> pickled. This obviously works for the normal Enum classes, but is a problem
> with the functional API because the class is created dynamically and has no
> module.
>
> To solve this, the reference implementation uses the same approach as
> namedtuple (*). In the metaclass's __new__ (this is an excerpt, the real
> code has some safeguards):
>
>   module_name = sys._getframe(1).f_globals['__name__']
>   enum_class.__module__ = module_name
>
> According to an earlier discussion, this works on CPython, PyPy and
> Jython, but not on IronPython. The alternative that works everywhere is to
> define the Enum like this:
>
>   Color = Enum('themodule.Color', 'red blue green')
>
> The reference implementation supports this as well.
>
> Some points for discussion:
>
> 1) We can say that using the functional API when pickling can happen is not
> recommended, but maybe a better way would be to just explain the way things
> are and let users decide?
It's probably worth creating a section in the pickle docs and explaining the vagaries of naming things and the dependency on knowing the module name. The issue comes up with defining classes in __main__ and when implementing pseudo-modules as well (see PEP 395).

Any pickle-expert volunteers to do this? I guess we can start by creating a documentation issue.
> 2) namedtuple should also support the fully qualified name syntax. If this > is agreed upon, I can create an issue.
Yes, I think that part should be done.
OK, I'll create an issue.
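
To make points 1 and 2 concrete, here is a minimal sketch of a namedtuple-style factory that accepts either a bare class name (and guesses the module from the caller's frame) or a fully qualified name. The factory name make_class and its "fields" attribute are hypothetical and only for illustration; this is not the PEP 435 reference implementation.

```python
import sys


def make_class(name, field_names):
    """Hypothetical namedtuple-style factory (not the PEP 435 code)."""
    # Fully qualified form: the caller spells out the module explicitly,
    # e.g. 'themodule.Shape'. This needs no stack inspection at all.
    module_name, _, class_name = name.rpartition(".")
    if not module_name:
        # Bare form: guess the module from the caller's frame. This relies
        # on sys._getframe, which not every Python implementation provides.
        module_name = sys._getframe(1).f_globals.get("__name__", "__main__")
    cls = type(class_name, (), {"fields": tuple(field_names.split())})
    # pickle locates classes as "<__module__>.<name>", so record it here.
    cls.__module__ = module_name
    return cls


Color = make_class("Color", "red blue green")
Shape = make_class("themodule.Shape", "circle square")
```

With the dotted form the module is whatever the caller wrote, so it works on any implementation, at the cost of repeating the module name by hand.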
> 3) Antoine mentioned that work is being done in 3.4 to enable pickling of
> nested classes (http://www.python.org/dev/peps/pep-3154/). If that gets
> implemented, I don't see a reason why Enum and namedtuple can't be adjusted
> to find the __qualname__ of the class they're internal to. Am I missing
> something?

The class-based form should still work (assuming only classes are involved), but the stack inspection will likely fail.
It can probably be made to work with a bit more effort than the current "hack", and I don't see why it wouldn't be doable.
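
For comparison, the enum module as it later shipped in Python 3.4 lets the caller pass the module and qualified name explicitly through the module and qualname keyword arguments of the functional API, so nothing has to be inferred from the stack. A minimal sketch, assuming Python 3.4 or later; the qualified name 'Zoo.Animal' is made up for illustration and no Zoo class is actually defined here, so only the recorded attributes are shown rather than a pickle round trip.

```python
from enum import Enum

# Explicit module and qualified name: nothing is guessed from the call stack.
# 'Zoo.Animal' is a hypothetical qualified name used only for illustration.
Animal = Enum(
    "Animal",
    "dog cat bear",
    module=__name__,
    qualname="Zoo.Animal",
)

names = [member.name for member in Animal]
```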
> 4) Using _getframe(N) here seems like overkill to me.
It's not just overkill, it's fragile - it only works if you call the constructor directly. If you use a convenience function in a utility module, it will try to load your pickles from there rather than wherever you bound the name.
In theory you can climb the frame stack until you reach the desired place, but this is specifically what my proposal of adding a function tries to avoid.
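
The fragility described above can be seen in a small sketch. The function names are hypothetical, and the caller's function name stands in for the caller's module so that the example fits in one file: a fixed frame depth of 1 always reports the immediate caller, which is the convenience wrapper rather than the code that actually wanted the class.

```python
import sys


def record_caller(name):
    """Hypothetical factory that looks exactly one frame up the stack."""
    caller = sys._getframe(1)  # CPython implementation detail
    return name, caller.f_code.co_name


def helper(name):
    # A convenience wrapper, standing in for a function in a utility module.
    return record_caller(name)


def user_code_direct():
    # Calls the factory directly: depth 1 is this function, as intended.
    return record_caller("Color")


def user_code_via_helper():
    # Goes through the wrapper: depth 1 is now helper, not this function.
    return helper("Color")
```

Here user_code_direct() reports itself as the caller, while user_code_via_helper() reports helper, which is the analogue of pickle later looking for the class in the utility module.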
> What we really need
> is just the module in which the code is currently executing (i.e. the
> metaclass's __new__ in our case). Would it make sense to add a new function
> somewhere in the stdlib of 3.4 (in sys or inspect or ...) that just provides
> the current module name? It seems that all Pythons should be able to easily
> provide it, it's certainly a very small subset of the functionality provided
> by walking the callframe stack. This function can then be used to build
> fully qualified names for pickling of Enum and namedtuple. Moreover, it can
> be useful even more widely - dynamic class building is quite common in
> Python code, and as Nick mentioned somewhere earlier, the extra power of
> metaclasses in the recent 3.x's will probably make it even more common.

Yes, I've been thinking along these lines myself, although in a slightly more expanded form that also touches on the issues that stalled PEP 406 (the import engine API that tries to better encapsulate the import state). It may also potentially address some issues with initialisation of C extensions (I don't remember the exact details off the top of my head, but there's some info we want to get from the import machinery to modules initialised from Cython, but the loader API and the C module initialisation API both get in the way).

Specifically, what I'm talking about is some kind of implicit context similar to the approach the decimal module uses to control operations on Decimal instances. In this case, what we're trying to track is the "active module", either __main__ (if the code has been triggered directly through an operation in that module), or else the module currently being imported (if the import machinery has been invoked). The bare minimum would just be to store the name (using sys.modules to get access to the full module if needed) in a way that adequately handles nested, circular and threaded imports, but there may be a case for tracking a richer ModuleContext object instead.
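
As a rough illustration of the "bare minimum" variant, the active module name could be kept on a per-thread stack, pushed when an import starts and popped when it finishes. This is only a sketch under stated assumptions: active_module and current_module_name are hypothetical names, no such API exists in the stdlib, and nothing here is actually hooked into the import machinery.

```python
import threading
from contextlib import contextmanager

# One stack per thread, so imports running in different threads do not
# see each other's active module.
_state = threading.local()


def _stack():
    if not hasattr(_state, "stack"):
        _state.stack = []
    return _state.stack


@contextmanager
def active_module(name):
    """Hypothetical hook: mark `name` as the active module for this block."""
    stack = _stack()
    stack.append(name)
    try:
        yield
    finally:
        # Popping restores the outer module, which covers nested imports.
        stack.pop()


def current_module_name(default="__main__"):
    """Return the innermost active module name, or `default` if none."""
    stack = _stack()
    return stack[-1] if stack else default
```

A factory such as Enum could then call current_module_name() instead of inspecting frames, provided the import system wrapped each module's execution in active_module(...).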
However, there's also a separate question of whether implicitly tracking the active module is really what we want. Do we want that, or is what we actually want the ability to define an arbitrary "naming context" in order to use functional APIs to construct classes without losing the pickle integration of class statements?

What if there was a variant of the class statement that bound the result of a function call rather than using the normal syntax:

    class Animal from enum.Enum(members="dog cat bear")

And it was only class statements in that form which manipulated the naming context? (You could also use the def keyword rather than class.)

Either form would essentially be an ordinary assignment statement, except that they would manipulate the naming context to record the name being bound and relevant details of the active module.

Regardless, I think the question is not really well enough defined to be a topic for python-dev, even though it came up in a python-dev discussion - it's more python-ideas territory.
Wait... I agree that having a special syntax for this is a novel idea that's not well defined and can be discussed on python-ideas. But the utility function I was mentioning is a pretty simple idea, and it's well defined. It can be very useful in contexts where code is created dynamically, by reducing the number of explicit frame-walking hacks.
Eli