[Python-Dev] Pickling Question

Guido van Rossum guido@python.org
Sat, 09 Nov 2002 09:35:53 -0500


In my pickling article [1] I looked at various ways to handle schema evolution issues. When it came to module name and location changes, I wrote the following:

"A module name or location change is conceptually similar to a class name change but must be handled quite differently. That's because the module information is stored in the pickle but is not an attribute that can be modified through the standard pickle interface. In fact, the only way to change the module information is to perform a search and replace operation on the actual pickle file itself. Exactly how you would do this depends on your operating system and the tools you have at your disposal. And obviously this is a situation where you will want to back up your files in case you make a mistake. But the change should be fairly straightforward and will work equally well with the binary pickle format as with the text pickle format."

I don't feel that this solution is entirely satisfactory and so I thought I would ask (a bit late, I know) whether I am completely correct in my assertions. If not, how else can this be handled? If so, is there any chance of adding a better way to handle this situation?

[1] http://www-106.ibm.com/developerworks/library/l-pypers.html

I don't believe a search-and-replace on a pickle can ever be safe. In a binary pickle, it might interfere with length fields. And in either kind of pickle, you might accidentally replace data that happens to look like a module name.
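Both hazards can be reproduced with a small sketch. The record contents and the names "oldmod", "newmod" and "new_module" are made up for illustration; the data is a plain string that only mentions the old module name, pickled with a binary protocol.

```python
import pickle

# Hypothetical data: a plain string that merely mentions the old module name.
record = {"note": "migrated from oldmod last year"}
blob = pickle.dumps(record, protocol=2)  # a binary protocol

# Hazard 1: an equal-length replacement keeps the pickle loadable, but it
# silently rewrites user data that only happened to look like a module name.
altered = pickle.loads(blob.replace(b"oldmod", b"newmod"))
print(altered["note"])

# Hazard 2: a longer replacement leaves the string's stored length field
# stale, so the unpickler stops reading the string too early and then
# misinterprets the leftover bytes as opcodes.
try:
    pickle.loads(blob.replace(b"oldmod", b"new_module"))
    outcome = "loaded"
except Exception as exc:  # the exact exception type may vary
    outcome = "failed: " + type(exc).__name__
print(outcome)
```

In the second case the load is expected to fail rather than return a corrupted object, but that is luck, not a guarantee: a replacement that happens to keep the byte stream well-formed would go unnoticed.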

I'd suggest something else instead: when you have a pickle referencing module A which has since been renamed to B, create a dummy module A that contains "from B import *". Then load the pickle, and write it back again. The loading should work because a reference to class A.C will find it (as an alias for B.C); the storing should store it as B.C because that's the real name of class C.
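A minimal sketch of this approach follows. To keep it self-contained, the modules are built in memory and registered in sys.modules; in a real project the dummy module would simply be a file named after the old module containing the single import line. The module names pkl_demo_A and pkl_demo_B and the helper make_module are hypothetical stand-ins for A and B.

```python
import pickle
import sys
import types

CLASS_SRC = "class C:\n    def __init__(self, x):\n        self.x = x\n"


def make_module(name, source):
    """Build an in-memory module from source and register it in sys.modules."""
    mod = types.ModuleType(name)
    exec(source, mod.__dict__)
    sys.modules[name] = mod
    return mod


# 1. Before the rename: class C lives in the old module, and a pickle is made.
old_mod = make_module("pkl_demo_A", CLASS_SRC)
old_pickle = pickle.dumps(old_mod.C(42))

# 2. The rename: the old module is gone and the new module defines C.
del sys.modules["pkl_demo_A"]
make_module("pkl_demo_B", CLASS_SRC)

# 3. The dummy module: the old name re-exports everything from the new one.
make_module("pkl_demo_A", "from pkl_demo_B import *")

# 4. Load through the alias, then write the object back out. The class's
#    real module is the new one, so that is the name stored in the new pickle.
obj = pickle.loads(old_pickle)
new_pickle = pickle.dumps(obj)

# 5. The dummy module is no longer needed to read the rewritten pickle.
del sys.modules["pkl_demo_A"]
restored = pickle.loads(new_pickle)
print(type(restored).__module__, restored.x)
```

One caveat with the star import: it skips names that begin with an underscore unless the new module lists them in __all__, so a pickled class with such a name would need an explicit import in the dummy module.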

--Guido van Rossum (home page: http://www.python.org/~guido/)