[Python-Dev] On a new version of pickle [PEP 3154]: self-referential frozensets
Alexandre Vassalotti alexandre at peadrop.com
Wed Jun 27 23:12:48 CEST 2012
On Sat, Jun 23, 2012 at 3:19 AM, M Stefan <mstefanro at gmail.com> wrote:
> * UNION_FROZENSET: like UPDATE_SET, but create a new frozenset
>     stack before: ... pyfrozenset mark stackslice
>     stack after : ... pyfrozenset.union(stackslice)
Since frozensets are immutable, could you explain how adding the UNION_FROZENSET opcode helps in pickling self-referential frozensets? Or are you only adding this one to follow the current style used for pickling dicts and lists in protocols 1 and onward?
> While this design allows pickling of self-referential sets, self-referential frozensets are still problematic. For instance, trying to pickle `fs':
>     a = A(); fs = frozenset([a]); a.fs = fs
> (when unpickling, the object a has to be initialized before it is added to the frozenset)
> The only way I can think of to make this work is to postpone the initialization of all the objects inside the frozenset until after UNION_FROZENSET. I believe this is doable, but there might be memory penalties if the approach is to simply store all the initialization opcodes in memory until pickling the frozenset is finished.
I don't think that's the only way. You could also emit a POP opcode to discard the frozenset from the stack and then emit a GET to fetch it back from the memo. This is how we currently handle self-referential tuples. Check out the save_tuple method in pickle.py to see how it is done. Personally, I would prefer that approach because it is already well-tested and proven to work.
That said, your approach sounds good too. The memory trade-off could lead to smaller pickles and more efficient decoding (though these self-referential objects are rare enough that I don't think that any improvements there would matter much).
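To make the POP-then-GET idea concrete, here is a minimal sketch using a self-referential tuple. The choice of protocol 2 and the use of pickletools.genops to list the opcodes are my own for this demonstration; exact opcode names can differ between protocols, so the checks below are deliberately loose.

```python
import pickle
import pickletools

# Build a tuple that contains itself through a list: t -> lst -> t.
lst = []
t = (lst,)
lst.append(t)

data = pickle.dumps(t, protocol=2)

# Collect the names of the opcodes the pickler emitted.
opcode_names = [op.name for op, arg, pos in pickletools.genops(data)]

# While the list is being saved, the tuple is pickled and memoized.
# The outer save of the tuple then discards what it pushed on the
# stack (POP or POP_MARK) and fetches the memoized tuple (a GET variant).
discards = [n for n in opcode_names if n in ("POP", "POP_MARK")]
fetches = [n for n in opcode_names if n.endswith("GET")]

restored = pickle.loads(data)
identity_preserved = restored[0][0] is restored

print(opcode_names)
print(identity_preserved)
```

The same trick is what I am suggesting for frozensets: build the object once while recursing, then throw away the redundant stack contents and fetch the memoized copy.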
> While self-referential frozensets are uncommon, a far more problematic situation is with the self-referential objects created with REDUCE. While pickle uses the idea of creating empty collections and then filling them, __reduce__ typically creates already-filled objects. For instance:
>     cnt = collections.Counter(); cnt[a]=3; a.cnt=cnt; cnt.__reduce__()
>     (<class 'collections.Counter'>, ({<__main__.A object at 0x0286E8F8>: 3},))
> where the A object contains a reference to the counter. Unpickling an object pickled with this reduce function is not possible, because the reduce function, which "explains" how to create the object, is asking for the object to exist before being created.
Your example seems to work on Python 3. I am not sure if I follow what you are trying to say. Can you provide a working example?
$ python3
Python 3.1.2 (r312:79147, Dec 9 2011, 20:47:34)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle, collections
>>> c = collections.Counter()
>>> class A: pass
...
>>> a = A()
>>> c[a] = 3
>>> a.cnt = c
>>> b = pickle.loads(pickle.dumps(a))
>>> b in b.cnt
True
> Pickle could try to fix this by detecting when __reduce__ returns a class type as the first tuple arg and move the dict ctor parameter to the state, but this may not always be intended. It's also a bit strange that __getstate__ is never used anywhere in pickle directly.
I would advise against any such change. The reduce protocol is already fairly complex. Further, I don't think changing it this way would give us any extra flexibility.
The documentation has a good explanation of how __getstate__ works under the hood: http://docs.python.org/py3k/library/pickle.html#pickling-class-instances
And if you need more, PEP 307 (http://www.python.org/dev/peps/pep-0307/) provides some of the design rationale for the API.
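On the remark that __getstate__ is never used in pickle directly: as far as I understand it, the call happens inside the default object.__reduce_ex__ rather than in the pickler itself. A small sketch, assuming an ordinary class with no custom __reduce__ (the Point class and its attribute names are hypothetical, made up for this example):

```python
class Point:
    """Hypothetical example class; not part of pickle or of any PEP."""

    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.cache = "scratch data"  # something we do not want pickled

    def __getstate__(self):
        # Report only x and y as the object's state.
        return {"x": self.x, "y": self.y}


p = Point(1, 2)

# The pickler asks the object to reduce itself; the default
# implementation is what ends up calling __getstate__.
reduced = p.__reduce_ex__(2)
func, args, state = reduced[:3]

print(func, args, state)
```

The third item of the reduce value is whatever __getstate__ returned, which is why the pickler never needs to call it by name.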