[Python-Dev] utf8 issue

Guido van Rossum guido@python.org
Thu, 05 Sep 2002 09:51:49 -0400


> > Please do. Bumping MAGIC is a no-no between dot releases. But I don't understand why that is necessary?

> It would be necessary since marshal uses UTF-8 for storing Unicode literals.

Do you mean that in 2.2 it doesn't?

> Even though it's highly unlikely that the problem cases are used in Python Unicode literals, there's a tiny chance. Without the MAGIC change this could result in PYC files failing to load.

Ha. You may have missed the start of this thread, but the whole problem was that a PYC file did fail to load! (The .py file had a lone surrogate in it.) So I'm not sure this argument holds much water.

Can someone please explain what change would be necessary, and to what part of the code, to prevent a lone surrogate in a string literal from producing a PYC file that blows up on load?
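
For illustration only, the failure mode discussed above (a lone surrogate gets written out as UTF-8, and a strict UTF-8 reader then refuses to load it) can be sketched in present-day Python 3. This is not the 2.2 marshal code. The helper names lenient_encode and strict_decode are hypothetical, and the "surrogatepass" error handler stands in, as an assumption, for a writer that does not reject surrogates.

```python
import marshal

# A high surrogate with no low surrogate following it.
LONE_SURROGATE = "\ud800"


def lenient_encode(text):
    # Hypothetical writer side: emit surrogates as ordinary
    # three-byte UTF-8 sequences instead of rejecting them.
    return text.encode("utf-8", "surrogatepass")


def strict_decode(data):
    # Hypothetical reader side: a strict UTF-8 decoder, which
    # rejects byte sequences that encode surrogate code points.
    return data.decode("utf-8")


encoded = lenient_encode(LONE_SURROGATE)

# Mismatched writer and reader: the data was written, but cannot be loaded.
try:
    strict_decode(encoded)
    load_failed = False
except UnicodeDecodeError:
    load_failed = True

# Using the same policy on both sides lets the value round-trip.
round_tripped = encoded.decode("utf-8", "surrogatepass")

# marshal in current Python 3 round-trips a lone surrogate.
marshal_ok = marshal.loads(marshal.dumps(LONE_SURROGATE)) == LONE_SURROGATE
```

The point of the sketch is that the writer and the reader have to agree: either both accept lone surrogates or both reject them. Which of the two is the right fix for marshal is the question being asked here.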

--Guido van Rossum (home page: http://www.python.org/~guido/)