[Python-Dev] cpython: Issue #16455: On FreeBSD and Solaris, if the locale is C, the
Victor Stinner victor.stinner at gmail.com
Tue Dec 4 09:32:35 CET 2012
- Previous message: [Python-Dev] Accept just PEP-0426
- Next message: [Python-Dev] slightly misleading Popen.poll() docs
Hi,
2012/12/4 Christian Heimes <christian at python.org>:
> On 04.12.2012 03:23, victor.stinner wrote:
>> http://hg.python.org/cpython/rev/c25635b137cc
>> changeset:   80718:c25635b137cc
>> parent:      80716:b845901cf702
>> user:        Victor Stinner <victor.stinner at gmail.com>
>> date:        Tue Dec 04 01:34:47 2012 +0100
>> summary:
>>   Issue #16455: On FreeBSD and Solaris, if the locale is C, the
>>   ASCII/surrogateescape codec is now used, instead of the locale encoding, to
>>   decode the command line arguments. This change fixes inconsistencies with
>>   os.fsencode() and os.fsdecode() because these operating systems announces an
>>   ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.
>>
>> files:
>>   Include/unicodeobject.h           |    2 +-
>>   Lib/test/test_cmd_line_script.py  |    9 +-
>>   Misc/NEWS                         |    6 +
>>   Objects/unicodeobject.c           |   24 +-
>>   Python/fileutils.c                |  240 +++++++++++++++++-
>>   5 files changed, 241 insertions(+), 40 deletions(-)
>>
>> ...
>> @@ -3110,7 +3110,8 @@
>>          *surrogateescape = 0;
>>          return 0;
>>      }
>> -    if (strcmp(errors, "surrogateescape") == 0) {
>> +    if (errors == "surrogateescape"
>> +        || strcmp(errors, "surrogateescape") == 0) {
>>          *surrogateescape = 1;
>>          return 0;
>>      }
>
> Victor,
>
> That doesn't look right. :) GCC is complaining about the code:
>
> Objects/unicodeobject.c: In function 'locale_error_handler':
> Objects/unicodeobject.c:3113:16: warning: comparison with string literal
> results in unspecified behavior [-Waddress]
Oh, I meant to commit this change separately and forgot. It's a micro-optimization.
PyUnicode_EncodeFSDefault() calls PyUnicode_EncodeLocale(unicode, "surrogateescape"), and PyUnicode_DecodeFSDefaultAndSize() calls PyUnicode_DecodeLocaleAndSize(s, size, "surrogateescape").
I chose to compare the addresses because I expected GCC to use the same address for "surrogateescape" in PyUnicode_EncodeFSDefault() and in locale_error_handler(), and comparing pointers is faster than comparing the string contents. As the warning says, though, C does not guarantee that two identical string literals share storage, so the result of the pointer comparison is unspecified.
I have removed this micro-optimization. The code path is only used during Python startup, and I don't expect any real speedup.
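For anyone wondering what a well-defined version of the pointer shortcut might look like, here is a rough sketch. It is not the code in unicodeobject.c: the names SURROGATEESCAPE and is_surrogateescape() are made up for illustration. The idea is that comparing against one named array is fine, because that array is a single object with a single address, whereas two identical string literals may or may not be merged by the compiler.

```c
#include <string.h>

/* Hypothetical shared constant (not a CPython name). A named array is one
 * object, so every user of this symbol sees the same address. */
const char SURROGATEESCAPE[] = "surrogateescape";

/* Hypothetical helper: returns 1 if errors names the surrogateescape
 * handler, 0 otherwise (including for NULL). */
int is_surrogateescape(const char *errors)
{
    if (errors == NULL)
        return 0;
    /* Fast path: pointer identity against the named array is well defined,
     * unlike errors == "surrogateescape" with a literal. */
    if (errors == SURROGATEESCAPE)
        return 1;
    /* Slow path: correct wherever the caller's string is stored. */
    return strcmp(errors, "surrogateescape") == 0;
}
```

Callers that pass SURROGATEESCAPE itself hit the fast path; callers that pass their own copy of the string still get the right answer through strcmp(). Whether the fast path is worth having on a startup-only code path is a separate question.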
> I'm also getting additional warnings in PyUnicode_Format().
>
> Objects/unicodeobject.c: In function 'PyUnicode_Format':
> Objects/unicodeobject.c:13782:8: warning: 'arg.sign' may be used uninitialized in this function [-Wmaybe-uninitialized]
> Objects/unicodeobject.c:13893:33: note: 'arg.sign' was declared here
> Objects/unicodeobject.c:13779:12: warning: 'str' may be used uninitialized in this function [-Wmaybe-uninitialized]
> Objects/unicodeobject.c:13894:15: note: 'str' was declared here
These members are initialized, but it's hard even for me (the author of this code) to verify that. I rewrote how these members are initialized, both to silence the warnings and to simplify the code.
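To illustrate the general pattern (this is only a sketch, not the actual change to PyUnicode_Format()): resetting every member in one place, before any branching, makes it obvious to both the compiler and the reader that no path reads an unset field. The struct format_arg and the functions format_arg_init() and parse_sign() below are hypothetical stand-ins, not CPython's real types.

```c
#include <stddef.h>

/* Hypothetical per-argument state; not the real struct in unicodeobject.c. */
struct format_arg {
    int sign;          /* -1, 0 or +1 */
    int width;         /* -1 means "not given" */
    const char *str;   /* rest of the format spec after the sign */
};

/* Set every member up front so later code never sees an unset field. */
void format_arg_init(struct format_arg *arg)
{
    arg->sign = 0;
    arg->width = -1;
    arg->str = NULL;
}

/* Parse an optional leading '+' or '-' and return the rest of the spec. */
const char *parse_sign(const char *spec, struct format_arg *arg)
{
    format_arg_init(arg);
    if (*spec == '+' || *spec == '-') {
        arg->sign = (*spec == '-') ? -1 : 1;
        spec++;
    }
    arg->str = spec;
    return spec;
}
```

With the defaults assigned unconditionally at the top, the branch only overrides them, which is the kind of structure -Wmaybe-uninitialized can usually follow.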
Thanks for the review!
Victor
PS: I hope that I really fixed the FreeBSD/Solaris issue :-p