Issue 19058: test_sys.test_ioencoding_nonascii() fails with ASCII locale encoding

Created on 2013-09-20 21:29 by pitrou, last changed 2022-04-11 14:57 by admin.

Files
File name, Uploaded
sys_test_ioencoding_locale.patch, serhiy.storchaka, 2013-09-25 09:40
sys_test_ioencoding.patch, serhiy.storchaka, 2013-09-28 20:56
Messages (13)
msg198174 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-09-20 21:29
The test added in fails on the new OS X buildbot:

======================================================================
FAIL: test_ioencoding_nonascii (test.test_sys.SysModuleTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/test_sys.py", line 581, in test_ioencoding_nonascii
    self.assertEqual(out, os.fsencode(test.support.FS_NONASCII))
AssertionError: b'' != b'\xc3\xa6'

http://buildbot.python.org/all/builders/AMD64%20Snow%20Leop%203.x/builds/4
msg198175 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-09-20 22:16
The test fails with an ASCII locale encoding (e.g. LANG= on Linux). The test should not try to display a non-ASCII character, but should check the encoding (sys.stdout.encoding) instead. The test should ensure that sys.stdout.encoding is the same with PYTHONIOENCODING unset (python started with the -E option and the current environment) and with the variable set to an empty value.
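The check described above could be sketched roughly as follows. This is an illustration only, not the test that was committed: the helper name `stdout_encoding` is hypothetical, and instead of `-E` it removes the variable from the child's environment, so that any other PYTHON* variables apply equally to both runs.

```python
import os
import subprocess
import sys


def stdout_encoding(env):
    """Return sys.stdout.encoding as reported by a child interpreter.

    Hypothetical helper for illustration; not part of test_sys.
    """
    cmd = [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"]
    result = subprocess.run(cmd, env=env, stdout=subprocess.PIPE, check=True)
    return result.stdout.decode("ascii").strip()


# Environment with PYTHONIOENCODING removed entirely.
base_env = dict(os.environ)
base_env.pop("PYTHONIOENCODING", None)

# Case 1: the variable is unset.
unset = stdout_encoding(base_env)

# Case 2: the variable is set, but to an empty value.
empty = stdout_encoding(dict(base_env, PYTHONIOENCODING=""))

# Both children should fall back to the same default encoding.
print(unset, empty, unset == empty)
```

Comparing the two values against each other, rather than against a hard-coded name, keeps the check independent of whatever locale the machine happens to use.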
msg198181 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-09-21 02:09
I set LC_CTYPE to en_US.utf-8 on the buildbot, which I think is the better setting for that buildbot, so the test doesn't fail there anymore. However, the test should still be fixed (and maybe we should have a buildbot running with no language set at all).
msg198369 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-09-24 22:04
Shouldn't FS_NONASCII be None with ASCII locale encoding?
msg198370 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-09-24 22:08
> Shouldn't FS_NONASCII be None with ASCII locale encoding?

See the description of the variable in test.support:

# FS_NONASCII: non-ASCII character encodable by os.fsencode(),
# or None if there is no such character.

The file system encoding and the locale encoding can be different... especially when PYTHONIOENCODING is used. The test should not use FS_NONASCII.
msg198373 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-09-25 01:05
Also note that on OS X I believe the fsencoding is always utf-8, but the locale can of course be something else.
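To see the distinction on a given machine, the two encodings can be printed side by side. This is a minimal sketch using only standard-library calls; the variable names are illustrative.

```python
import locale
import sys

# Encoding used by os.fsencode() / os.fsdecode() for file names.
fs_encoding = sys.getfilesystemencoding()

# Encoding derived from the current locale settings (LANG, LC_ALL, LC_CTYPE);
# passing False avoids temporarily changing the process locale.
locale_encoding = locale.getpreferredencoding(False)

print("filesystem encoding:", fs_encoding)
print("locale encoding:    ", locale_encoding)
```

When the two differ, a character accepted by os.fsencode() is not guaranteed to be encodable by a stream that uses the locale encoding.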
msg198379 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-09-25 09:40
Indeed. Here is a patch. It uses same algorithm to obtain encodable non-ASCII string as for FS_NONASCII, but with locale encoding. It also adds new tests and simplifies existing tests.
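The general idea of picking an encodable non-ASCII character for a given encoding might look like the sketch below. The candidate list and the function name `first_encodable_nonascii` are assumptions made for illustration; they are not taken from the patch or from test.support.

```python
import locale

# Hypothetical candidates: a few non-ASCII characters from different scripts.
CANDIDATES = ("\u00e6", "\u0130", "\u0141", "\u0444", "\u05d0", "\u20ac")


def first_encodable_nonascii(encoding, candidates=CANDIDATES):
    """Return the first candidate that round-trips through `encoding`.

    Returns None if no candidate is encodable (for example with ASCII).
    """
    for ch in candidates:
        try:
            if ch.encode(encoding).decode(encoding) == ch:
                return ch
        except UnicodeError:
            # Not representable in this encoding; try the next candidate.
            continue
    return None


locale_encoding = locale.getpreferredencoding(False)
print(repr(first_encodable_nonascii(locale_encoding)))
```

With an ASCII locale encoding the function returns None, which is the situation in which a test built on such a character has to be skipped.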
msg198544 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-09-28 19:23
> Here is a patch. It uses same algorithm to obtain encodable
> non-ASCII string as for FS_NONASCII, but with locale encoding.
> It also adds new tests and simplifies existing tests.

I don't like your patch. The purpose of PYTHONIOENCODING is to set sys.stdin/stdout/stderr encodings. Your patch does not check sys.stdout.encoding, but check directly the codec. Two codecs may encode the same character as the same byte sequence.

Your test is skipped if the locale encoding is ASCII, whereas the purpose of PYTHONIOENCODING is to write non-ASCII characters without having to care of the locale encoding.

I would really prefer to simply check sys.stdin.encoding, sys.stdout.encoding and sys.stderr.encoding attributes. If you really want to check the codec itself, you should use known sequence, ex: 'héllo€'.encode('cp1252') gives b'h\xe9llo\x80'.
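Both points about codecs can be checked directly in an interpreter. The sketch below uses escape sequences for the non-ASCII characters so that it does not depend on the source file encoding; the variable names are illustrative.

```python
# 'héllo€' written with escapes: \u00e9 is é, \u20ac is the euro sign.
text = "h\u00e9llo\u20ac"

# The known sequence from the message above.
encoded = text.encode("cp1252")
print(encoded)  # b'h\xe9llo\x80'

# Two different codecs can produce identical bytes for the same character,
# so comparing output bytes alone cannot tell them apart.
same_bytes = "\u00e9".encode("cp1252") == "\u00e9".encode("latin-1")
print(same_bytes)  # True

# The euro sign does distinguish them: latin-1 cannot encode it at all.
try:
    "\u20ac".encode("latin-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False
print(latin1_has_euro)  # False
```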
msg198552 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-09-28 20:56
Here is a patch which directly checks sys.std* attributes.
msg198553 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-09-28 21:02
> Your test is skipped if the locale encoding is ASCII, whereas the purpose of PYTHONIOENCODING is to write non-ASCII characters without having to care of the locale encoding.

This case was tested in previous test.

> If you really want to check the codec itself, you should use known sequence, ex: 'héllo€'.encode('cp1252') gives b'h\xe9llo\x80'.

We can't be sure that OS supports cp1252 (or any other non-default) locale.
msg198554 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-09-28 21:06
> Your patch does not check sys.stdout.encoding, but check directly the codec. Two codecs may encode the same character as the same byte sequence.

Checking encoding name is too rigid. Python interpreter can normalize encoding name before assigning it to standard streams. This is implementation detail.
msg252514 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-10-08 06:36
What could you say about the recent patch Victor?
msg253008 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-14 16:33
> What could you say about the recent patch Victor?

I'm not sure that it works in all cases. io.TextIOWrapper doesn't care to normalize the encoding name. You should use something like:

    encoding = codecs.lookup(encoding).name

Otherwise, the test can fail if you use one of the various aliases of each encoding. Example: "UTF-8" vs "utf8" vs "utf-8".
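A small sketch of the normalization suggested above, comparing encoding names through codecs.lookup() rather than as raw strings. The helper name `same_encoding` is hypothetical.

```python
import codecs


def same_encoding(name_a, name_b):
    """Compare two encoding names after resolving aliases via the codec registry."""
    return codecs.lookup(name_a).name == codecs.lookup(name_b).name


# Aliases of one encoding resolve to a single canonical name.
print(codecs.lookup("UTF-8").name)
print(codecs.lookup("utf8").name)

# A plain string comparison would treat these as different.
print("UTF-8" == "utf8", same_encoding("UTF-8", "utf8"))
```

In a test, both the expected value and sys.stdout.encoding would be passed through the same lookup before being compared.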
History
Date User Action Args
2022-04-11 14:57:51 admin set github: 63258
2015-10-14 16:33:57 vstinner set messages: +
2015-10-08 06:36:41 serhiy.storchaka set messages: +
2013-09-28 21:06:28 serhiy.storchaka set messages: +
2013-09-28 21:02:10 serhiy.storchaka set messages: +
2013-09-28 20:56:59 serhiy.storchaka set files: + sys_test_ioencoding.patch; messages: +
2013-09-28 19:23:29 vstinner set messages: +
2013-09-25 09:40:27 serhiy.storchaka set files: + sys_test_ioencoding_locale.patch; keywords: + patch; messages: +; stage: needs patch -> patch review
2013-09-25 01:05:28 r.david.murray set messages: +
2013-09-24 22:08:30 vstinner set messages: +
2013-09-24 22:04:01 serhiy.storchaka set messages: +
2013-09-24 21:58:02 vstinner set title: test_ioencoding_nonascii (test_sys) fails on Snow Leopard -> test_sys.test_ioencoding_nonascii() fails with ASCII locale encoding
2013-09-21 02:09:51 r.david.murray set messages: +
2013-09-20 22:16:09 vstinner set messages: +
2013-09-20 21:29:35 pitrou create