[Python-Dev] Python 3.5 now uses surrogateescape for the POSIX locale

Nick Coghlan ncoghlan at gmail.com
Tue Mar 18 10:48:38 CET 2014


On 18 March 2014 19:13, Victor Stinner <victor.stinner at gmail.com> wrote:
> 2014-03-18 9:08 GMT+01:00 Nick Coghlan <ncoghlan at gmail.com>:
>>
>> On 18 Mar 2014 11:56, "Victor Stinner" <victor.stinner at gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I modified Python 3.5 to use the "surrogateescape" error handler
>>> (PEP 383) for stdin and stdout when the LC_CTYPE locale is POSIX
>>> ("C" locale):
>>> http://bugs.python.org/issue19977
>>
>> Yay, thanks Victor. I'll let the Fedora folks know this has been
>> merged, as we may seriously consider applying this as a vendor patch
>> to our build of Python 3.4 (while I agree this isn't a bug fix, the
>> current behaviour also poses a problem for Fedora as more core
>> utilities start migrating to Python 3).
>
> Please don't cherry-pick this change in Fedora if it is not done in
> Python 3.4. It changes the behaviour of Python and I would prefer to
> have the same behaviour on the same Python version on all platforms.
>
> I'm not against backporting the change in Python 3.4.1. It can be seen
> as a bugfix. I don't think that anyone wants a Unicode error when
> reading or printing non-ASCII data from stdin/to stdout. But I would
> like the opinion of other developers before doing that.

Well, the concern has always been the risk of silently generating bad data if there is a mismatch between the OS encoding and the stream encodings. That's why it took so long to make this change at all - we had to figure out that the underlying problem was really the ease with which even a properly configured Linux system could end up running Python 3 code in the POSIX locale, and thus end up with improperly configured standard streams. Enabling "surrogateescape" by default only when the standard stream encoding is "ascii" helps to mitigate that risk, while still dealing with the main problem.

I meant to try to get this into 3.4 (since a couple of the Fedora folks convinced me it was a problem), but there are only so many hours in the day, and it took me quite a while to fully grasp the actual problem.
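
As an illustration of the behaviour under discussion, here is a minimal sketch in pure Python. It is not the actual change from issue 19977 (which lives in the interpreter's stream setup); the helper name `stream_error_handler` is hypothetical and only mirrors the rule described above: use "surrogateescape" when the stream encoding is "ascii", otherwise stay strict.

```python
import codecs


def stream_error_handler(encoding):
    """Hypothetical helper: pick an error handler for a standard stream.

    Mirrors the rule described in the thread, not CPython's real code:
    only an ASCII stream gets "surrogateescape"; anything else stays strict.
    """
    # codecs.lookup() normalises the spelling of the encoding name.
    if codecs.lookup(encoding).name == "ascii":
        return "surrogateescape"
    return "strict"


# Non-ASCII bytes, as they might arrive on stdin under the POSIX locale.
raw = "café\n".encode("utf-8")

# With the default strict handler, decoding as ASCII raises an error.
try:
    raw.decode("ascii")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With surrogateescape (PEP 383), each undecodable byte becomes a lone
# surrogate in the range U+DC80..U+DCFF instead of raising.
text = raw.decode("ascii", errors=stream_error_handler("ascii"))

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text.encode("ascii", errors="surrogateescape")
```

The round trip is lossless, which is what lets such data pass from stdin to stdout unchanged. The escaped text is still not real Unicode text, though: encoding it with a strict handler fails, which is the "bad data" risk the ASCII-only restriction is meant to limit.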

If folks are open to backporting this change to 3.4.1, then yes, I'd definitely prefer an upstream solution. Otherwise, it will be up to the Fedora Python maintainers to decide what they want to do.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


