[Python-Dev] PEP 435: pickling enums created with the functional API
Nick Coghlan ncoghlan at gmail.com
Tue May 7 17:03:38 CEST 2013
- Previous message: [Python-Dev] PEP 435: pickling enums created with the functional API
- Next message: [Python-Dev] PEP 435: pickling enums created with the functional API
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Tue, May 7, 2013 at 11:34 PM, Eli Bendersky <eliben at gmail.com> wrote:
One of the contended issues with PEP 435 on which Guido pronounced was the functional API, which allows creating enumerations dynamically in a manner similar to namedtuple:
    Color = Enum('Color', 'red blue green')

The biggest complaint reported against this API is its interaction with pickle. As promised, I want to discuss here how we're going to address this concern.

At this point, the pickle docs say that module-top-level classes can be pickled. This obviously works for the normal Enum classes, but is a problem with the functional API because the class is created dynamically and has no module. To solve this, the reference implementation uses the same approach as namedtuple (*). In the metaclass's __new__ (this is an excerpt, the real code has some safeguards):

    modulename = sys._getframe(1).f_globals['__name__']
    enumclass.__module__ = modulename

According to an earlier discussion, this works on CPython, PyPy and Jython, but not on IronPython. The alternative that works everywhere is to define the Enum like this:

    Color = Enum('themodule.Color', 'red blue green')

The reference implementation supports this as well.

Some points for discussion:

1) We can say that using the functional API when pickling can happen is not recommended, but maybe a better way would be to just explain the way things are and let users decide?
It's probably worth creating a section in the pickle docs and explaining the vagaries of naming things and the dependency on knowing the module name. The issue comes up with defining classes in __main__ and when implementing pseudo-modules as well (see PEP 395).
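To make the dependency on the module name concrete, here is a minimal sketch rather than a definitive recipe. It assumes the module keyword argument that the functional API accepts in the standard library's enum module (a different spelling from the 'themodule.Color' string form quoted above). It also fakes an importable module called themodule by registering a types.ModuleType in sys.modules; that registration is purely illustrative.

```python
import pickle
import sys
import types
from enum import Enum

# Illustrative stand-in for a real importable module named "themodule".
mod = types.ModuleType('themodule')
sys.modules['themodule'] = mod

# Tell the functional API explicitly which module the class lives in,
# instead of letting it guess from the calling frame.
Color = Enum('Color', 'red blue green', module='themodule')

# pickle stores classes by reference (module name + class name), so the
# class must actually be reachable under that name at load time.
mod.Color = Color

data = pickle.dumps(Color.red)
restored = pickle.loads(data)

print(Color.__module__)
print(restored is Color.red)
```

If the class were not bound as themodule.Color, the lookup by name would fail and the member could not be pickled.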
2) namedtuple should also support the fully qualified name syntax. If this is agreed upon, I can create an issue.
Yes, I think that part should be done.
3) Antoine mentioned that work is being done in 3.4 to enable pickling of nested classes (http://www.python.org/dev/peps/pep-3154/). If that gets implemented, I don't see a reason why Enum and namedtuple can't be adjusted to find the __qualname__ of the class they're internal to. Am I missing something?
The class based form should still work (assuming only classes are involved), but the stack inspection will likely fail.
4) Using _getframe(N) here seems like overkill to me.
It's not just overkill, it's fragile - it only works if you call the constructor directly. If you use a convenience function in a utility module, it will try to load your pickles from there rather than wherever you bound the name.
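The fragility can be shown with a small sketch. make_class below is a hypothetical stand-in for a functional API such as Enum() or namedtuple(), not the reference implementation; utils_demo and convenience are likewise invented names for a utility module and its helper. The .get() fallback to '__main__' is an assumed safeguard of the kind the excerpt above alludes to.

```python
import sys
import types


def make_class(name):
    """Hypothetical functional API that guesses the defining module."""
    cls = type(name, (), {})
    # Look one frame up and take that frame's module name
    # (relies on sys._getframe, which not every implementation provides).
    caller_globals = sys._getframe(1).f_globals
    cls.__module__ = caller_globals.get('__name__', '__main__')
    return cls


# Direct call: the immediate caller is the module binding the name.
Direct = make_class('Direct')

# Indirect call: a convenience function living in a separate utility module.
utils = types.ModuleType('utils_demo')
utils.make_class = make_class
exec("def convenience(name):\n    return make_class(name)\n", utils.__dict__)

Indirect = utils.convenience('Indirect')

print(Direct.__module__)    # the module that bound the name
print(Indirect.__module__)  # the helper's module, not the one binding the name
```

Because pickle looks classes up by module and name, instances of Indirect would be searched for in utils_demo, where no such name was ever bound.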
What we really need is just the module in which execution currently is (i.e. the metaclass's __new__ in our case). Would it make sense to add a new function somewhere in the stdlib of 3.4 (in sys or inspect or ...) that just provides the current module name? It seems that all Pythons should be able to easily provide it; it's certainly a very small subset of the functionality provided by walking the callframe stack. This function can then be used to build fully qualified names for pickling of Enum and namedtuple. Moreover, it can be useful even more widely: dynamic class building is quite common in Python code, and as Nick mentioned somewhere earlier, the extra power of metaclasses in the recent 3.x's will probably make it even more common.
Yes, I've been thinking along these lines myself, although in a slightly more expanded form that also touches on the issues that stalled PEP 406 (the import engine API that tries to better encapsulate the import state). It may also potentially address some issues with initialisation of C extensions (I don't remember the exact details off the top of my head, but there's some info we want to get from the import machinery to modules initialised from Cython, but the loader API and the C module initialisation API both get in the way).
Specifically, what I'm talking about is some kind of implicit context similar to the approach the decimal module uses to control operations on Decimal instances. In this case, what we're trying to track is the "active module", either __main__ (if the code has been triggered directly through an operation in that module), or else the module currently being imported (if the import machinery has been invoked).
The bare minimum would just be to store the name (using sys.modules to get access to the full module if needed) in a way that adequately handles nested, circular and threaded imports, but there may be a case for tracking a richer ModuleContext object instead.
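As a rough illustration of that bare minimum, and nothing more than a sketch, the name tracking could be a per-thread stack. The names importing and active_module are hypothetical; no such API exists in the standard library, and a real version would have to be driven by the import machinery itself rather than by explicit with blocks.

```python
import threading
from contextlib import contextmanager

# Per-thread state, so imports running in different threads do not interfere.
_state = threading.local()


def _stack():
    # Each thread starts with __main__ as its active module.
    if not hasattr(_state, 'stack'):
        _state.stack = ['__main__']
    return _state.stack


@contextmanager
def importing(module_name):
    """Hypothetical hook: mark module_name as active for the duration."""
    stack = _stack()
    stack.append(module_name)
    try:
        yield
    finally:
        # Nested (including circular) imports unwind in LIFO order.
        stack.pop()


def active_module():
    """Return the name of the module currently considered active."""
    return _stack()[-1]


seen_in_thread = []


def worker():
    seen_in_thread.append(active_module())


with importing('pkg.a'):
    with importing('pkg.b'):
        inner = active_module()
    outer = active_module()
    # A fresh thread does not inherit this thread's import stack.
    t = threading.Thread(target=worker)
    t.start()
    t.join()

print(inner, outer, active_module(), seen_in_thread)
```

The full module object can then be fetched from sys.modules by name when needed; a richer ModuleContext object would replace the plain strings on the stack.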
However, there's also a separate question of whether implicitly tracking the active module is really what we want. Do we want that, or is what we actually want the ability to define an arbitrary "naming context" in order to use functional APIs to construct classes without losing the pickle integration of class statements?
What if there was a variant of the class statement that bound the result of a function call rather than using the normal syntax:
class Animal from enum.Enum(members="dog cat bear")
And it was only class statements in that form which manipulated the naming context? (you could also use the def keyword rather than class)
Either form would essentially be an ordinary assignment statement, except that they would manipulate the naming context to record the name being bound and relevant details of the active module.
Regardless, I think the question is not really well enough defined to be a topic for python-dev, even though it came up in a python-dev discussion - it's more python-ideas territory.
Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia