Issue 10014: sys.path[0] is incorrect if PYTHONFSENCODING is used

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/54223

classification

Title: sys.path[0] is incorrect if PYTHONFSENCODING is used
Type:
Stage:
Components: Unicode
Versions: Python 3.2

process

Status: closed
Resolution: fixed
Dependencies: 10039
Superseder:
Assigned To:
Nosy List: vstinner
Priority: normal
Keywords: patch

Created on 2010-10-02 12:14 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name | Uploaded | Description
realpath_fs_encoding-2.patch | vstinner, 2010-10-07 23:10 |
Messages (7)
msg117870 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-10-02 12:14
In the following example, sys.path[0] should be '/home/SHARE/SVN/py3k\udcc3\udca9' (my locale and filesystem encodings are utf-8):

$ cd /home/SHARE/SVN/py3ké
$ echo "import sys; print(sys.path[0])" > x.py
$ ./python x.py
/home/SHARE/SVN/py3ké
$ PYTHONFSENCODING=ascii ./python x.py
/home/SHARE/SVN/py3ké

The problem is that PySys_SetArgvEx() inserts argv[0] at sys.path[0], but argv[0] is decoded using the locale encoding (by _Py_char2wchar() in main()), whereas paths of sys.path are supposed to be encodable (and decoded) by sys.getfilesystemencoding().

argv array should be decoded using the filesystem encoding (see issue #9992) or argv[0] should be redecoded (encode to the locale encoding, and decode from the filesystem encoding, see issue #9630).
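The mismatch described above can be reproduced at the Python level. The following is a minimal sketch, not the C code path in main(): it assumes the directory name is stored on disk as UTF-8 bytes, that the locale encoding is utf-8, and that the filesystem encoding has been forced to ascii. The variable names are illustrative only.

```python
# Bytes of the directory name as stored on disk (assumed to be UTF-8).
path_bytes = "/home/SHARE/SVN/py3k\u00e9".encode("utf-8")

# What the issue reports: argv[0] decoded with the locale encoding (utf-8 here).
locale_decoded = path_bytes.decode("utf-8", "surrogateescape")

# What sys.path[0] should be when the filesystem encoding is ascii:
# undecodable bytes become lone surrogates via surrogateescape.
fs_decoded = path_bytes.decode("ascii", "surrogateescape")

# The "redecode" option: encode back with the locale encoding,
# then decode with the filesystem encoding.
redecoded = locale_decoded.encode("utf-8", "surrogateescape").decode(
    "ascii", "surrogateescape"
)

# ascii() is used because a str containing lone surrogates cannot be
# printed directly on a utf-8 stdout.
print(ascii(locale_decoded))
print(ascii(fs_decoded))
print(ascii(redecoded))
```

Only the surrogate-escaped form encodes back to the original bytes under the ascii filesystem encoding, which is why the locale-decoded value is the wrong one to store in sys.path.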
msg118088 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-10-06 23:37
> The problem is that PySys_SetArgvEx() ...

Not only PySys_SetArgvEx(). There is another issue with RunMainFromImporter(), which does:

    sys.path[0] = filename
msg118090 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-10-07 01:26
See also issue #10039.
msg118101 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-10-07 11:33
This issue depends on issue #10039.
msg118102 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-10-07 11:41
r85302: _wrealpath() and _Py_wreadlink() support surrogates in the input path.

realpath_fs_encoding.patch: patch _wrealpath() to encode the resulting path with the filesystem encoding (with surrogateescape) instead of the locale encoding. This patch is incomplete: it doesn't fix the issue for non-Windows platforms without the realpath() function.

redecode_filename.patch (from issue #10039) + realpath_fs_encoding.patch fix this issue on my Linux (Debian Sid) box.
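As a rough Python-level analogue of the approach in that patch (the patch itself changes C code and is not reproduced here), the sketch below resolves a path by converting it to bytes with a chosen encoding and surrogateescape, calling os.path.realpath() on the bytes, and decoding the result with the same encoding. The helper name realpath_with_encoding and its encoding parameter are hypothetical.

```python
import os


def realpath_with_encoding(path, encoding):
    """Resolve `path`, converting to and from bytes with `encoding`.

    Hypothetical helper: surrogateescape lets bytes that are not valid
    in `encoding` survive the str -> bytes -> str round trip.
    """
    raw = path.encode(encoding, "surrogateescape")
    resolved = os.path.realpath(raw)  # bytes in, bytes out
    return resolved.decode(encoding, "surrogateescape")


# Example: resolve the filesystem root using ascii as the encoding.
print(ascii(realpath_with_encoding(os.sep, "ascii")))
```

Using the same encoding for both conversions is the point of the sketch: the resolved path stays encodable by whatever encoding the rest of sys.path is expected to use.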
msg118151 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-10-07 23:10
I just created Python/fileutils.c: update the patch for this new file.
msg118595 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-10-13 22:20
Fixed by r85430 (remove PYTHONFSENCODING), see #9992.
History
Date User Action Args
2022-04-11 14:57:07 admin set github: 54223
2010-10-13 22:25:10 vstinner set status: open -> closed
2010-10-13 22:20:25 vstinner set resolution: fixed; messages: +
2010-10-07 23:11:05 vstinner set files: - realpath_fs_encoding.patch
2010-10-07 23:10:58 vstinner set files: + realpath_fs_encoding-2.patch; messages: +
2010-10-07 11:41:50 vstinner set files: + realpath_fs_encoding.patch; keywords: + patch; messages: +
2010-10-07 11:33:05 vstinner set dependencies: + python é.py fails with UnicodeEncodeError if PYTHONFSENCODING is used; messages: +
2010-10-07 01:26:20 vstinner set messages: +
2010-10-06 23:37:41 vstinner set messages: +
2010-10-02 12:14:21 vstinner create