[Python-Dev] cpython: Issue #16455: On FreeBSD and Solaris, if the locale is C, the

Victor Stinner victor.stinner at gmail.com
Tue Dec 4 09:32:35 CET 2012


Hi,

2012/12/4 Christian Heimes <christian at python.org>:

Am 04.12.2012 03:23, schrieb victor.stinner:

http://hg.python.org/cpython/rev/c25635b137cc
changeset:   80718:c25635b137cc
parent:      80716:b845901cf702
user:        Victor Stinner <victor.stinner at gmail.com>
date:        Tue Dec 04 01:34:47 2012 +0100
summary:
  Issue #16455: On FreeBSD and Solaris, if the locale is C, the
  ASCII/surrogateescape codec is now used, instead of the locale encoding, to
  decode the command line arguments. This change fixes inconsistencies with
  os.fsencode() and os.fsdecode() because these operating systems announces an
  ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.

files:
  Include/unicodeobject.h          |    2 +-
  Lib/test/test_cmd_line_script.py |    9 +-
  Misc/NEWS                        |    6 +
  Objects/unicodeobject.c          |   24 +-
  Python/fileutils.c               |  240 +++++++++++++++++-
  5 files changed, 241 insertions(+), 40 deletions(-)

...

@@ -3110,7 +3110,8 @@
         *surrogateescape = 0;
         return 0;
     }
-    if (strcmp(errors, "surrogateescape") == 0) {
+    if (errors == "surrogateescape"
+        || strcmp(errors, "surrogateescape") == 0) {
         *surrogateescape = 1;
         return 0;
     }

Victor,

That doesn't look right. :) GCC is complaining about the code:

Objects/unicodeobject.c: In function 'locale_error_handler':
Objects/unicodeobject.c:3113:16: warning: comparison with string literal results in unspecified behavior [-Waddress]

Oh, I forgot to put this change in a separate commit. It's a micro-optimization.

PyUnicode_EncodeFSDefault() calls PyUnicode_EncodeLocale(unicode, "surrogateescape"), and PyUnicode_DecodeFSDefaultAndSize() calls PyUnicode_DecodeLocaleAndSize(s, size, "surrogateescape").

I chose to compare the addresses because I expected GCC to generate the same address for "surrogateescape" in PyUnicode_EncodeFSDefault() and in locale_error_handler(), and comparing pointers is faster than comparing the string contents.

I removed this micro-optimization. The code path is only used during Python startup, so I don't expect any real speedup from it.
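To illustrate why GCC warns here: C does not guarantee that two identical string literals share one address, so `errors == "surrogateescape"` may be true or false for the same contents. The sketch below is not the CPython code. The names `SURROGATEESCAPE` and `is_surrogateescape()` are hypothetical, and it only shows one way a pointer fast path could be kept well-defined, by comparing against a single named array and still falling back to strcmp().

```c
#include <string.h>

/* Hypothetical shared constant: a named array has exactly one address,
   unlike a string literal, which the compiler may or may not merge. */
const char SURROGATEESCAPE[] = "surrogateescape";

/* Hypothetical helper (not CPython's locale_error_handler()):
   returns 1 if `errors` names the surrogateescape handler, else 0. */
int
is_surrogateescape(const char *errors)
{
    if (errors == NULL)
        return 0;
    /* Fast path: well-defined, because the comparison is against a named
       object rather than a string literal. */
    if (errors == SURROGATEESCAPE)
        return 1;
    /* Slow path: compare the contents. This is what makes the function
       correct for callers that pass their own copy of the string. */
    return strcmp(errors, SURROGATEESCAPE) == 0;
}
```

The fast path only pays off if callers pass `SURROGATEESCAPE` itself, and on a startup-only code path the plain strcmp() is simpler and just as good.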

I'm also getting additional warnings in PyUnicode_Format().

Objects/unicodeobject.c: In function 'PyUnicode_Format':
Objects/unicodeobject.c:13782:8: warning: 'arg.sign' may be used uninitialized in this function [-Wmaybe-uninitialized]
Objects/unicodeobject.c:13893:33: note: 'arg.sign' was declared here
Objects/unicodeobject.c:13779:12: warning: 'str' may be used uninitialized in this function [-Wmaybe-uninitialized]
Objects/unicodeobject.c:13894:15: note: 'str' was declared here

These members are initialized, but it is hard even for me (the author of this code) to verify that. I rewrote how these members are initialized, both to silence the warnings and to simplify the code.
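As a rough illustration of that kind of rewrite (not the actual change to unicodeobject.c), the sketch below gives every member a defined value before any branch, so that neither the compiler nor a reader has to trace each path to see that a member is set. The struct `format_arg`, its members, and `parse_format_arg()` are hypothetical stand-ins.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a format-argument struct. */
struct format_arg {
    int sign;   /* 1 if a leading '+' was given, else 0 */
    int width;  /* minimum field width, -1 if not given */
};

/* Parse a spec such as "+12" into *arg (which must not be NULL).
   Returns 0 on success, -1 on a malformed spec. */
int
parse_format_arg(const char *spec, struct format_arg *arg)
{
    /* Initialize everything up front: no path can leave a member unset. */
    arg->sign = 0;
    arg->width = -1;

    if (spec == NULL)
        return -1;
    if (*spec == '+') {
        arg->sign = 1;
        spec++;
    }
    if (isdigit((unsigned char)*spec)) {
        arg->width = 0;
        while (isdigit((unsigned char)*spec)) {
            arg->width = arg->width * 10 + (*spec - '0');
            spec++;
        }
    }
    /* Trailing characters are an error, but *arg is still fully defined. */
    return *spec == '\0' ? 0 : -1;
}
```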

Thanks for the review!

Victor

PS: I hope that I really fixed the FreeBSD/Solaris issue :-p
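For readers unfamiliar with the codec named in the commit summary, here is a minimal sketch of what ASCII decoding with the surrogateescape error handler does: bytes below 0x80 decode to themselves, and each undecodable byte 0x80..0xFF is mapped to the lone surrogate U+DC80..U+DCFF so that encoding restores the original byte. The function names are hypothetical and this is not the code in Python/fileutils.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode n bytes as ASCII/surrogateescape.
   `out` must have room for n code points. */
void
ascii_surrogateescape_decode(const unsigned char *in, size_t n, uint32_t *out)
{
    for (size_t i = 0; i < n; i++) {
        /* Non-ASCII bytes are escaped as lone surrogates U+DC80..U+DCFF. */
        out[i] = in[i] < 0x80 ? (uint32_t)in[i] : 0xDC00u + in[i];
    }
}

/* Hypothetical helper: the reverse operation. Returns 0 on success, or -1
   if a code point is neither ASCII nor an escaped byte. */
int
ascii_surrogateescape_encode(const uint32_t *in, size_t n, unsigned char *out)
{
    for (size_t i = 0; i < n; i++) {
        if (in[i] < 0x80)
            out[i] = (unsigned char)in[i];
        else if (in[i] >= 0xDC80 && in[i] <= 0xDCFF)
            out[i] = (unsigned char)(in[i] - 0xDC00);  /* restore raw byte */
        else
            return -1;
    }
    return 0;
}
```

Because the mapping is reversible, a command line argument that contains, say, an ISO-8859-1 byte survives a decode followed by an encode unchanged, which is the consistency with os.fsencode() and os.fsdecode() that the commit is after.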


