[Python-Dev] Unpickling py2 str as py3 bytes (and vice versa) - implementation (issue #6784)

Guido van Rossum guido at python.org
Tue Mar 13 22:13:31 CET 2012


On Tue, Mar 13, 2012 at 12:42 PM, Michael Foord <fuzzyman at voidspace.org.uk> wrote:

On 13 Mar 2012, at 04:44, Merlijn van Deen wrote:

http://bugs.python.org/issue6784 ("byte/unicode pickle incompatibilities between python2 and python3")

Hello all,

Currently, pickle unpickles python2 'str' objects as python3 'str' objects, where the encoding to use is passed to the Unpickler. However, there are cases where it makes more sense to unpickle a python2 'str' as python3 'bytes' - for instance when it is actually binary data, and not text.

Currently, the mapping is as follows, when reading a pickle:

python2 'str' -> python3 'str' (using an encoding supplied to Unpickler)
python2 'unicode' -> python3 'str'

or, when creating a pickle using protocol <= 2:

python3 'str' -> python2 'unicode'
python3 'bytes' -> python2 'builtins.bytes object'
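To make the read-side mapping concrete, here is a minimal sketch of how the `encoding` argument changes what a python2 'str' becomes on python3. The pickle bytes are written by hand to stand in for what Python 2 would emit for a two-byte binary string, and the helper name `load_py2_str` is purely illustrative. The `encoding="bytes"` value is an assumption about the interpreter in use: it is accepted by recent Python 3 releases but was not available when this thread was written.

```python
import pickle

# Hand-written protocol-2 pickle of the Python 2 str '\xff\x00' (binary data):
# PROTO 2, SHORT_BINSTRING with length 2, BINPUT 0, STOP.
PY2_STR_PICKLE = b"\x80\x02U\x02\xff\x00q\x00."


def load_py2_str(data, encoding):
    """Unpickle `data`, decoding Python 2 str objects with `encoding`."""
    return pickle.loads(data, encoding=encoding)


# latin-1 maps every byte to a code point, so this never fails,
# but the result is text (str), not binary data.
as_text = load_py2_str(PY2_STR_PICKLE, "latin-1")

# The special value "bytes" (assumed to be supported by the running
# interpreter) keeps the payload as a bytes object.
as_bytes = load_py2_str(PY2_STR_PICKLE, "bytes")

# The default encoding is ASCII with strict error handling, so
# non-ASCII binary data cannot be loaded without an explicit encoding.
try:
    pickle.loads(PY2_STR_PICKLE)
    default_ok = True
except UnicodeDecodeError:
    default_ok = False
```

If the payload had been stored as text, `as_text.encode("latin-1")` would recover the original bytes, which is the usual workaround when `"bytes"` is not available.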

It does seem unfortunate that by default it is impossible for a developer to "do the right thing" as regards pickling / unpickling here. Binary data on Python 2 being unpickled as Unicode on Python 3 is presumably for the convenience of developers doing the wrong thing (and only works for ascii anyway).

Well, since trying to migrate data between versions using pickle is the "wrong" thing anyway, I think the status quo is just fine. Developers doing the "right" thing don't use pickle for this purpose.

--
--Guido van Rossum (python.org/~guido)


