[Python-Dev] Inconsistencies if locale and filesystem encodings are different
Victor Stinner victor.stinner at haypocalc.com
Thu Oct 7 18:19:41 CEST 2010
Hi,
A PYTHONFSENCODING environment variable was added to Python 3.2 (issue #8622). This variable introduces an inconsistency, because the filesystem encoding and the locale encoding can now be different.
There are (at least) four issues related to this problem. We have two choices to fix them:
(a) use the same encoding to encode and decode each value (the encoding can be different for each issue)
(b) remove the PYTHONFSENCODING variable and raise an error if the locale and filesystem encodings are different (ensuring that both encodings are the same)
Choice (a) is not easy to implement, but it is feasible, and I have already written some patches.
I don't understand how Python interacts with other programs that ignore the PYTHONFSENCODING environment variable. It's as if Python used its own "locale".
Choice (b) looks easy to implement, but there is the problem of Mac OS X. Mac OS X uses UTF-8 for the filesystem (and not the locale encoding), whereas it looks like the locale encoding is used for the command line arguments. See issue #4388 for more information.
There may also be a useful use case for PYTHONFSENCODING, but I don't remember which one :-)
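To make choice (b) concrete, here is a minimal sketch of such a check written in pure Python. The function names same_codec() and check_encodings() are hypothetical and are not part of any patch; a real check would live in the interpreter startup code. Encoding names are normalized with codecs.lookup() so that aliases such as "UTF-8" and "utf8" compare equal.

```python
import codecs
import locale
import sys


def same_codec(a: str, b: str) -> bool:
    """Return True if two encoding names resolve to the same codec."""
    return codecs.lookup(a).name == codecs.lookup(b).name


def check_encodings(fs_encoding: str, locale_encoding: str) -> None:
    """Hypothetical check for choice (b): refuse mismatched encodings."""
    if not same_codec(fs_encoding, locale_encoding):
        raise RuntimeError(
            "filesystem encoding %r differs from locale encoding %r"
            % (fs_encoding, locale_encoding)
        )


if __name__ == "__main__":
    # Only report what the running interpreter uses; do not raise here.
    fs = sys.getfilesystemencoding()
    loc = locale.getpreferredencoding(False)
    print("filesystem:", fs, "locale:", loc, "same:", same_codec(fs, loc))
```

On a platform where the two encodings legitimately differ, such as the Mac OS X case described above, a check like this would reject a valid configuration, which is exactly the difficulty with choice (b).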
Issues
sys.argv:
- decoded from the locale encoding
- subprocess encodes process arguments to the filesystem encoding => issue #9992
sys.path:
- decoded from the locale encoding
- import encodes paths to the filesystem encoding => issue #10014
script name:
- read on the command line (e.g. python script.py) and decoded from the locale encoding
- used to fill sys.path[0] without any encoding conversion
- import encodes paths to the filesystem encoding => issue #10039
PYTHONWARNINGS environment variable:
- decoded from the locale encoding
- subprocess encodes environment variables to the filesystem encoding => issue #9988
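All of these issues share the same pattern: a byte string is decoded with one encoding and later re-encoded with another. The following sketch shows the effect in isolation. It does not touch sys.argv, sys.path or the environment; the helper decode_then_encode() and the file name are made up for illustration, and ISO-8859-1 and UTF-8 simply stand in for a locale encoding and a filesystem encoding that differ.

```python
def decode_then_encode(raw: bytes, decode_with: str, encode_with: str) -> bytes:
    """Decode raw bytes with one encoding, then re-encode with another."""
    return raw.decode(decode_with).encode(encode_with)


# Bytes of a file name as the operating system would pass them (UTF-8 here).
original = "caf\u00e9.py".encode("utf-8")

# Same encoding on both sides: the bytes survive unchanged.
same = decode_then_encode(original, "utf-8", "utf-8")

# Decoded as ISO-8859-1 but encoded as UTF-8: the bytes no longer match,
# so the child process or the import machinery would see a different name.
mixed = decode_then_encode(original, "iso-8859-1", "utf-8")

print(original, same, mixed)
```

This is why choice (a) requires each value to be encoded with the same encoding that was used to decode it, whichever encoding that is.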
-- Victor Stinner http://www.haypocalc.com/