Issue 8725: Python3: use ASCII for the file system encoding on initfsencoding() failure

Created on 2010-05-15 12:39 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description
fsencoding_ascii-2.patch vstinner,2010-05-16 01:13
Messages (5)
msg105804 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-15 12:39
I introduced initfsencoding() in #8610 to ensure that Py_FileSystemDefaultEncoding is no longer NULL. In the discussion, Marc Lemburg noticed that falling back to UTF-8 on nl_langinfo(CODESET) error is a bad idea: ASCII is better (I agree). We cannot fall back to ASCII yet because two other problems have to be fixed first:
- Python3 doesn't support surrogates in module filenames: see #8611
- If Py_FileSystemDefaultEncoding is NULL, encoding functions fall back to utf-8 (PyUnicode_GetDefaultEncoding()). #8715 proposes a new PyUnicode_EncodeFSDefault() function to fix this problem
The attached patch is a partial fix for this issue.
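The fallback proposed in this message can be sketched in Python as follows. This illustrates the idea only and is not the code in the attached patch: `pick_fs_encoding` and its `get_codeset` argument are hypothetical names, with `get_codeset` standing in for the `nl_langinfo(CODESET)` call.

```python
import codecs


def pick_fs_encoding(get_codeset):
    """Return the codec name reported by get_codeset(), or "ascii" on failure.

    get_codeset is a hypothetical callable standing in for
    nl_langinfo(CODESET); it may raise or return an unusable name.
    """
    try:
        name = get_codeset()
        if not name:
            raise LookupError("empty codeset name")
        # codecs.lookup() raises LookupError for unknown codec names.
        return codecs.lookup(name).name
    except (LookupError, OSError, ValueError, TypeError):
        # Proposed behavior: fall back to ASCII instead of UTF-8.
        return "ascii"
```

On a POSIX system the callable could be `lambda: locale.nl_langinfo(locale.CODESET)`. Note that the issue was eventually closed without adopting this fallback.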
msg105820 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-15 16:34
PyUnicode_AsEncodedString() contains a special path for the file system encoding. I don't think that it is still needed, but I don't know how to check that. => read
msg105842 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-16 01:13
Version 2:
- #8715 has been committed: patch PyUnicode_EncodeFSDefault()
- fix the documentation according to the changes
msg111758 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-07-28 01:29
I tried the patch on my import_unicode branch and it doesn't work if the locale encoding is not ASCII (just as the current code doesn't work if the locale encoding is not UTF-8, #8611). If Py_FileSystemDefaultEncoding is NULL: PyUnicode_EncodeFSDefault() should use wcstombs() and PyUnicode_DecodeFSDefault() should use mbstowcs(). They may reuse _Py_wchar2char() and _Py_char2wchar(). "ascii" should be used in initfsencoding().
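As a rough Python-level analogy for the behavior described here (not the C implementation), the two helpers below use the locale's preferred encoding when no explicit file system encoding is configured. The names `encode_fs` and `decode_fs` are hypothetical, and the use of the `surrogateescape` error handler is an assumption that this message does not state.

```python
import locale


def encode_fs(name, encoding=None):
    """Encode a str file name to bytes (hypothetical helper).

    If no encoding is configured, use the locale's preferred encoding,
    loosely mirroring a conversion through the C locale functions.
    """
    enc = encoding or locale.getpreferredencoding(False)
    # Assumption: surrogateescape turns escaped surrogates back into raw bytes.
    return name.encode(enc, "surrogateescape")


def decode_fs(raw, encoding=None):
    """Decode a bytes file name to str (hypothetical helper)."""
    enc = encoding or locale.getpreferredencoding(False)
    # Assumption: undecodable bytes become lone surrogates U+DC80..U+DCFF.
    return raw.decode(enc, "surrogateescape")


# A UTF-8 encoded name read under an ASCII file system encoding.
raw_name = b"caf\xc3\xa9.py"
text_name = decode_fs(raw_name, "ascii")
round_trip = encode_fs(text_name, "ascii")
```

Under these assumptions, `round_trip` equals `raw_name`, so a name that the ASCII codec cannot decode still survives a decode and encode cycle unchanged.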
msg119180 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-10-19 23:55
initfsencoding() now raises a fatal error on get_codeset() error. Using an encoding different from the locale encoding when get_codeset() fails only leads to mojibake and encoding issues; it's not a good idea. Closing this issue as invalid.
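The mojibake mentioned in this closing message can be reproduced with a small Python sketch. The helper `misread_name` is hypothetical, and UTF-8 versus Latin-1 is only an example pair of mismatched encodings.

```python
def misread_name(name, real_encoding, assumed_encoding):
    """Encode name as stored on disk, then decode it with the wrong codec."""
    on_disk = name.encode(real_encoding)
    return on_disk.decode(assumed_encoding)


original = "caf\u00e9.txt"                       # "café.txt"
garbled = misread_name(original, "utf-8", "latin-1")

# Writing the garbled name back as UTF-8 yields different bytes,
# so it no longer refers to the same file.
rewritten = garbled.encode("utf-8")
```

Here `garbled` is `"cafÃ©.txt"`, and `rewritten` differs from the original UTF-8 bytes, which is the kind of silent damage that a fatal error at startup avoids.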
History
Date User Action Args
2022-04-11 14:57:01 admin set github: 52971
2010-10-19 23:55:25 vstinner set status: open -> closed; resolution: not a bug; messages: +
2010-07-28 01:29:23 vstinner set messages: +
2010-05-16 01:13:13 vstinner set files: - fsencoding_ascii.patch
2010-05-16 01:13:07 vstinner set files: + fsencoding_ascii-2.patch; messages: +
2010-05-15 16:34:19 vstinner set messages: +
2010-05-15 12:40:20 vstinner set nosy: + lemburg, loewis, pitrou, Arfrever; dependencies: + Python3 doesn't support locale different than utf8 and an non-ASCII path (POSIX), Create PyUnicode_EncodeFSDefault() function
2010-05-15 12:39:05 vstinner create