[Python-Dev] PEP 540: Add a new UTF-8 mode

Victor Stinner victor.stinner at gmail.com
Tue Dec 5 17:50:57 EST 2017


Chris:

I just took another look at 538 -- and yes, the relationship between the two is really unclear. In particular, with 538, why do we need 540? I honestly don't know.

PEP 538 only affects platforms which provide the C.UTF-8 locale or a variant: only a few recent Linux distributions. I know Fedora has it, maybe a few others do too? FreeBSD and macOS are completely ignored by PEP 538. PEP 540 uses the UTF-8 encoding for the POSIX locale on all platforms.

Moreover, PEP 538 only concerns the POSIX locale (locale "C"), whereas PEP 540 is usable with any locale. For example, with the "fr_FR.iso88591" locale, the encoding is Latin-1. But if you enable the UTF-8 mode with this locale, Python will use UTF-8.
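
To make the "any locale" point concrete, here is a minimal sketch that starts a child interpreter under a given locale and reports which encodings it picked. It assumes the "-X utf8" option and the sys.flags.utf8_mode flag as they shipped in Python 3.7; the helper name probe_encodings is hypothetical, and the Latin-1 locale may not be installed on your system, in which case the result without the UTF-8 mode will differ.

```python
import os
import subprocess
import sys

# Child snippet: report the UTF-8 mode flag and two encodings Python picked.
REPORT = (
    "import sys; "
    "print(sys.flags.utf8_mode, sys.getfilesystemencoding(), sys.stdout.encoding)"
)


def probe_encodings(locale_name, utf8_mode):
    """Run a child interpreter under locale_name.

    Returns (utf8_mode flag, filesystem encoding, stdout encoding).
    Hypothetical helper, for illustration only.
    """
    env = dict(os.environ)
    # Drop settings that would override what we are trying to observe.
    for name in ("PYTHONUTF8", "PYTHONIOENCODING", "LC_CTYPE", "LANG"):
        env.pop(name, None)
    env["LC_ALL"] = locale_name

    args = [sys.executable]
    if utf8_mode:
        args += ["-X", "utf8"]  # enable the PEP 540 UTF-8 mode
    args += ["-c", REPORT]

    out = subprocess.run(
        args, env=env, capture_output=True, text=True, check=True
    ).stdout
    flag, fs_enc, stdout_enc = out.split()
    return int(flag), fs_enc.lower(), stdout_enc.lower()


if __name__ == "__main__":
    # Without the UTF-8 mode, the result depends on the locale being installed.
    print(probe_encodings("fr_FR.iso88591", utf8_mode=False))
    # With the UTF-8 mode, Python uses UTF-8 whatever the locale says.
    print(probe_encodings("fr_FR.iso88591", utf8_mode=True))
```

Setting PYTHONUTF8=1 in the environment should have the same effect as "-X utf8".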

The other difference is that PEP 538 is implemented with setlocale(LC_CTYPE, "C.UTF-8"), whereas PEP 540 is implemented in Python internals and ignores the locale. The scope of PEP 540 is limited to Python: non-Python code running in the same process is not aware of the "Python UTF-8 mode".
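
The scope difference can be observed from Python itself. The sketch below, again assuming Python 3.7 or later, runs a child interpreter in the "C" locale with the UTF-8 mode enabled and with PEP 538 locale coercion switched off through PYTHONCOERCECLOCALE=0. It then queries the C-level LC_CTYPE locale, which is what C libraries in the same process see, next to Python's own filesystem encoding. The helper name c_locale_vs_python_encoding is hypothetical.

```python
import os
import subprocess
import sys

# Child snippet: print the C-level LC_CTYPE locale, then Python's own encoding.
REPORT = (
    "import locale, sys; "
    "print(locale.setlocale(locale.LC_CTYPE)); "  # query only, no change
    "print(sys.getfilesystemencoding())"
)


def c_locale_vs_python_encoding():
    """Return (LC_CTYPE locale name, Python filesystem encoding) of a child
    interpreter started in the C locale with the UTF-8 mode enabled.

    Hypothetical helper, for illustration only.
    """
    env = dict(os.environ)
    for name in ("PYTHONUTF8", "PYTHONIOENCODING", "LC_CTYPE", "LANG"):
        env.pop(name, None)
    env["LC_ALL"] = "C"
    env["PYTHONCOERCECLOCALE"] = "0"  # keep PEP 538 out of the picture

    out = subprocess.run(
        [sys.executable, "-X", "utf8", "-c", REPORT],
        env=env, capture_output=True, text=True, check=True,
    ).stdout
    c_locale, fs_enc = out.splitlines()
    return c_locale, fs_enc.lower()


if __name__ == "__main__":
    c_locale, fs_enc = c_locale_vs_python_encoding()
    print("C-level LC_CTYPE locale:", c_locale)
    print("Python filesystem encoding:", fs_enc)
```

The locale name reported for LC_CTYPE should not mention UTF-8, since the UTF-8 mode never calls setlocale() to change it, while Python reports utf-8 for its own encoding.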

Victor


