Issue 31475: Bug in argparse - not supporting utf8
Created on 2017-09-14 17:48 by Ali Razmjoo, last changed 2022-04-11 14:58 by admin. This issue is now closed.
Messages (7)
Author: Ali Razmjoo (Ali Razmjoo) *
Date: 2017-09-14 17:48
Regarding the #3468 discussion, the same bug existed in argparse (and optparse), which is fixed in this PR. utf8 is not supported in the argparse module.
#3478: https://github.com/python/cpython/pull/3577 current pr: https://github.com/python/cpython/pull/3577
Regards.
Author: R. David Murray (r.david.murray) *
Date: 2017-09-14 17:51
As I requested in the PR, please provide a way to reproduce the bug you are reporting.
Author: R. David Murray (r.david.murray) *
Date: 2017-09-14 17:56
Note that, without a reproducer, I find it confusing to talk about argparse supporting or not supporting utf8. As far as I know, it deals only with text strings, which are unicode. Or is this a 2.7-only bug report? (Although even there it would be a question of unicode support, not utf8.)
Author: Inada Naoki (methane) *
Date: 2017-09-17 18:01
You reported the stack trace on the GitHub pull request, but the discussion should happen here, not in the pull request.
As far as I can tell from the traceback, your problem is already solved in Python 3.6 by PEP 528.
Author: Inada Naoki (methane) *
Date: 2017-09-25 03:51
ping? May I close this issue and pull request?
Author: Josh Rosenberg (josh.r) *
Date: 2017-09-25 17:00
Based on the OP's patch, it looks like they have a problem where they have non-ASCII text in their output strings (either due to using non-ASCII switches, or using non-ASCII help documentation), but sys.stdout/sys.stderr are configured for some encoding that doesn't support said characters, so they're getting exceptions when the help message is sent to the screen automatically (e.g. by running with --help).
It's only sort of a bug in Python: Fundamentally, the problem is a script that assumes arbitrary Unicode support being run under a locale that doesn't provide it. The solution provided is bad though: It shouldn't be trying to force UTF8 output regardless of locale.
A simple repro, at least on Linux-like systems, would be to run Python with LANG=C (and no other LC variables set), then do:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-f', help=chr(233)) # help is 'é'
parser.print_help()
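The same failure can be sketched without changing the process locale by printing the help to a stream that only accepts ASCII. This is an illustration, not part of the original report: the helper name `help_fails_under_ascii` and the `prog="demo"` value are made up for the example, and the in-memory stream only stands in for a stdout configured under an ASCII-only locale. On newer interpreters, where the C locale may be coerced to UTF-8, the LANG=C repro above may no longer fail, which is one more reason to simulate the stream directly.

```python
import argparse
import io


def help_fails_under_ascii() -> bool:
    """Return True if printing non-ASCII help to an ASCII-only stream raises."""
    parser = argparse.ArgumentParser(prog="demo")  # "demo" is a placeholder name
    parser.add_argument("-f", help=chr(233))  # help is 'é'

    # Stand-in for a stdout whose encoding cannot represent 'é'.
    ascii_stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii", errors="strict")
    try:
        parser.print_help(file=ascii_stream)
        ascii_stream.flush()
    except UnicodeEncodeError:
        return True
    return False


if __name__ == "__main__":
    print("help printing failed:", help_fails_under_ascii())
```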
While the patch as given is wrong (with the exception of Windows weirdness, blithely ignoring/second-guessing the locale is a terrible idea), it's not a terrible idea to fix this in some way; if nothing else, it might make sense to have some fallback approach when the exception is raised (e.g. encoding the output with errors='ignore' or the like) so running scriptname.py --help at least provides some output even with incompatible locales, rather than dying with an error in the help message handling code itself.
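A minimal sketch of that kind of fallback, done at the script level rather than inside argparse, could look like the following. The function `print_help_safely` is hypothetical and is not an argparse API, and it uses errors='replace' instead of errors='ignore' so that dropped characters stay visible as '?'. It assumes the target stream exposes an `encoding` attribute and falls back to ASCII otherwise.

```python
import argparse
import io
import sys


def print_help_safely(parser, file=None):
    """Write the parser's help, degrading unencodable characters instead of raising.

    Hypothetical helper for illustration; argparse does not provide it.
    """
    file = sys.stdout if file is None else file
    text = parser.format_help()
    try:
        file.write(text)
    except UnicodeEncodeError:
        # The stream's encoding cannot represent some characters.
        encoding = getattr(file, "encoding", None) or "ascii"
        # Round-trip through the stream's encoding, substituting '?' for
        # anything it cannot encode, then write the degraded text.
        safe_text = text.encode(encoding, errors="replace").decode(encoding)
        file.write(safe_text)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(prog="demo")  # placeholder program name
    parser.add_argument("-f", help=chr(233))  # help is 'é'

    # Stand-in for an ASCII-only stdout.
    ascii_stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
    print_help_safely(parser, ascii_stream)
    ascii_stream.flush()
    print(ascii_stream.buffer.getvalue().decode("ascii"))
```

The locale is still respected here: the text is only degraded after the stream itself has rejected it, rather than forcing UTF-8 output up front.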
Author: Inada Naoki (methane) *
Date: 2017-09-30 12:00
We have already accepted PEP 529 and PEP 538, and there is also #15216 (PR 2343). So I don't think we need more solutions.
History
Date | User | Action | Args
2022-04-11 14:58:52 | admin | set | github: 75656
2017-09-30 13:38:44 | serhiy.storchaka | set | status: open -> closed; resolution: rejected; stage: resolved
2017-09-30 12:00:59 | methane | set | messages: +
2017-09-25 17:00:43 | josh.r | set | status: pending -> open; nosy: + josh.r; messages: +
2017-09-25 05:19:24 | serhiy.storchaka | set | status: open -> pending
2017-09-25 03:51:06 | methane | set | messages: +
2017-09-17 18:01:01 | methane | set | nosy: + methane; messages: +
2017-09-14 17:56:25 | r.david.murray | set | messages: +
2017-09-14 17:51:37 | r.david.murray | set | nosy: + r.david.murray; messages: +
2017-09-14 17:48:43 | Ali Razmjoo | create |