[Python-Dev] undesireable unpickle behavior, proposed fix
Guido van Rossum guido at python.org
Tue Jan 27 19:57:22 CET 2009
On Tue, Jan 27, 2009 at 10:43 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> Interning the strings on unpickling makes the pickles smaller, and at
>> least for cPickle actually makes unpickling sequences of many objects
>> slightly faster. I have included proposed patches to cPickle.c and
>> pickle.py, and would appreciate any feedback.
>
> Please submit patches always to the bug tracker.
>
> On the proposed change: While it is fairly unintrusive, I would like to
> propose a different approach - pickle interned strings special. The
> marshal module already uses this approach, and it should extend to
> pickle (although it would probably require a new protocol).
>
> On pickling, inspect each string and check whether it is interned. If
> so, emit a different code, and record it into the object id dictionary.
> On a second occurrence of the string, only pickle a backward reference.
> (Alternatively, check whether pickling the same string a second time
> would be more compact).
>
> On unpickling, support the new code to intern the result strings;
> subsequent references to it will go to the standard backreferencing
> algorithm.
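
The pickling and unpickling steps proposed above can be sketched with a toy encoder. This is only an illustration under stated assumptions, not pickle's actual format: the opcode names `STR`, `INTERNED_STR` and `GET` are made up, the functions `encode` and `decode` are hypothetical, and the "is this string interned?" check is passed in as a hypothetical predicate `is_interned` because the sketch does not rely on any interpreter-level check. It uses Python 3's `sys.intern` where the thread's Python 2 code would use the builtin `intern`.

```python
import sys


def encode(strings, is_interned):
    """Encode a list of str into toy opcodes (hypothetical, not pickle's)."""
    memo = {}  # id(obj) -> memo index, playing the role of the object id dictionary
    ops = []
    for s in strings:
        if id(s) in memo:
            # Second occurrence of the same object: emit only a backward reference.
            ops.append(("GET", memo[id(s)]))
            continue
        # First occurrence: pick the opcode, then record the object in the memo.
        code = "INTERNED_STR" if is_interned(s) else "STR"
        memo[id(s)] = len(memo)
        ops.append((code, s.encode("utf-8")))
    return ops


def decode(ops):
    """Rebuild the list; strings sent as INTERNED_STR are interned on load."""
    memo = []  # memo index -> decoded object, filled in the same order as in encode
    out = []
    for code, arg in ops:
        if code == "GET":
            value = memo[arg]
        else:
            value = arg.decode("utf-8")  # a fresh str object, as when reading a pickle
            if code == "INTERNED_STR":
                value = sys.intern(value)
            memo.append(value)
        out.append(value)
    return out
```

In this sketch a repeated interned string costs one full entry plus short `GET` references, and the decoded copies all resolve to the single interned object.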
Hm. This would change the pickling format though. Wouldn't just interning (short) strings on unpickling be simpler?
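
A rough sketch of what interning short strings at load time could look like, without touching the pickle format. It is an approximation under stated assumptions: it post-processes the result of `pickle.loads` rather than patching the unpickler itself, the helper names `intern_short_strings` and `loads_interned` and the cutoff `MAX_INTERN_LEN` are hypothetical, and it uses Python 3's `sys.intern` where the Python 2 code under discussion would use the builtin `intern`.

```python
import pickle
import sys

MAX_INTERN_LEN = 32  # hypothetical cutoff for what counts as a "short" string


def intern_short_strings(obj, max_len=MAX_INTERN_LEN):
    """Return obj with short str values replaced by their interned copies.

    Only exact str, list, tuple and dict are handled; anything else is
    returned unchanged.
    """
    if type(obj) is str:
        return sys.intern(obj) if len(obj) <= max_len else obj
    if type(obj) is list:
        return [intern_short_strings(x, max_len) for x in obj]
    if type(obj) is tuple:
        return tuple(intern_short_strings(x, max_len) for x in obj)
    if type(obj) is dict:
        return {
            intern_short_strings(k, max_len): intern_short_strings(v, max_len)
            for k, v in obj.items()
        }
    return obj


def loads_interned(data):
    """Unpickle data, then intern the short strings found in the result."""
    return intern_short_strings(pickle.loads(data))
```

With this helper, records loaded from separate pickles end up sharing one string object per short key, and the pickles themselves stay readable by any existing unpickler.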
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)