Issue 31475: Bug in argparse - not supporting utf8

Created on 2017-09-14 17:48 by Ali Razmjoo, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (7)

msg302190

Author: Ali Razmjoo (Ali Razmjoo) *

Date: 2017-09-14 17:48

Regarding the #3468 discussion: the same bug was present in argparse (and optparse), and it is fixed in this PR. utf8 is not supported in the argparse module.

#3478: https://github.com/python/cpython/pull/3577 current pr: https://github.com/python/cpython/pull/3577

Regards.

msg302191

Author: R. David Murray (r.david.murray) * (Python committer)

Date: 2017-09-14 17:51

As I requested in the PR, please provide a way to reproduce the bug you are reporting.

msg302192

Author: R. David Murray (r.david.murray) * (Python committer)

Date: 2017-09-14 17:56

Note that, without a reproducer, it is confusing to me to talk about argparse supporting or not supporting utf8. As far as I know, argparse deals only with text strings, which are unicode. Or is this a 2.7-only bug report? (Although even there it would be a question of unicode support, not utf8.)

msg302379

Author: Inada Naoki (methane) * (Python committer)

Date: 2017-09-17 18:01

You posted a stack trace on the GitHub pull request, but discussion should take place here, not in the pull request.

Judging from the traceback, your problem is already solved in Python 3.6 by PEP 528.

msg302910

Author: Inada Naoki (methane) * (Python committer)

Date: 2017-09-25 03:51

ping? May I close this issue and pull request?

msg302964

Author: Josh Rosenberg (josh.r) * (Python triager)

Date: 2017-09-25 17:00

Based on the OP's patch, it looks like they have a problem where they have non-ASCII text in their output strings (either due to using non-ASCII switches, or using non-ASCII help documentation), but sys.stdout/sys.stderr are configured for some encoding that doesn't support said characters, so they're getting exceptions when the help message is sent to the screen automatically (e.g. by running with --help).

It's only sort of a bug in Python: Fundamentally, the problem is a script that assumes arbitrary Unicode support being run under a locale that doesn't provide it. The solution provided is bad though: It shouldn't be trying to force UTF8 output regardless of locale.

A simple repro, at least on Linux-like systems, would be to run Python with LANG=C (and no other LC variables set), then do:

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-f', help=chr(233)) # help is 'é'
parser.print_help()
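
The same failure can probably be triggered without touching the environment, by handing print_help() a stream that only accepts ASCII. The sketch below is an illustration, not part of the original report: the helper name print_help_to_ascii_stream and the prog value "demo" are made up, and the ASCII-encoded io.TextIOWrapper merely stands in for sys.stdout under LANG=C.

```python
import argparse
import io


def print_help_to_ascii_stream():
    """Return the exception raised while printing help to an ASCII stream, or None."""
    parser = argparse.ArgumentParser(prog="demo")  # fixed prog: independent of sys.argv
    parser.add_argument("-f", help=chr(233))  # help is 'é'

    # Stand-in for a stdout whose encoding cannot represent 'é' (e.g. under LANG=C).
    stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
    try:
        parser.print_help(file=stream)
        stream.flush()
    except UnicodeEncodeError as exc:
        return exc
    return None


error = print_help_to_ascii_stream()
print(type(error).__name__ if error else "no error")
```

If the stream is created with encoding="utf-8" instead, the help text is written without an exception, which matches the diagnosis that the locale, not argparse, decides whether the output can be encoded.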

While the patch as given is wrong (with the exception of Windows weirdness, blithely ignoring/second-guessing the locale is a terrible idea), it's not a terrible idea to fix this in some way; if nothing else, it might make sense to have some fallback approach when the exception is raised (e.g. encoding the output with errors='ignore' or the like) so running scriptname.py --help at least provides some output even with incompatible locales, rather than dying with an error in the help message handling code itself.
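
One way to picture the fallback idea is a small wrapper that retries the write after round-tripping the message through the stream's own encoding. This is only a sketch under assumptions: print_help_with_fallback is a hypothetical helper, not an argparse API, and it uses errors="replace" rather than the errors='ignore' mentioned above so that dropped characters stay visible as '?'. It also assumes a failed write() leaves nothing partially written, which holds for io.TextIOWrapper but is not guaranteed for arbitrary file-like objects.

```python
import argparse
import io
import sys


def print_help_with_fallback(parser, file=None):
    """Write the parser's help text, degrading unencodable characters instead of raising."""
    if file is None:
        file = sys.stdout
    message = parser.format_help()
    try:
        file.write(message)
    except UnicodeEncodeError:
        # Respect the stream's encoding (and therefore the locale); only the
        # characters it cannot represent are replaced with '?'.
        encoding = getattr(file, "encoding", None) or "ascii"
        safe = message.encode(encoding, errors="replace").decode(encoding)
        file.write(safe)


# Demonstration against an ASCII-only stream standing in for stdout under LANG=C.
parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("-f", help=chr(233))  # help is 'é'
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="ascii")
print_help_with_fallback(parser, file=stream)
stream.flush()
output = raw.getvalue().decode("ascii")
```

Unlike forcing UTF-8 output, this keeps whatever encoding the locale selected and only gives up on the individual characters that encoding cannot express.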

msg303407

Author: Inada Naoki (methane) * (Python committer)

Date: 2017-09-30 12:00

We have already accepted PEP 529 and PEP 538, and there is also #15216 (PR 2343), so I don't think we need more solutions.

History

Date                 User              Action  Args
2022-04-11 14:58:52  admin             set     github: 75656
2017-09-30 13:38:44  serhiy.storchaka  set     status: open -> closed; resolution: rejected; stage: resolved
2017-09-30 12:00:59  methane           set     messages: +
2017-09-25 17:00:43  josh.r            set     status: pending -> open; nosy: + josh.r; messages: +
2017-09-25 05:19:24  serhiy.storchaka  set     status: open -> pending
2017-09-25 03:51:06  methane           set     messages: +
2017-09-17 18:01:01  methane           set     nosy: + methane; messages: +
2017-09-14 17:56:25  r.david.murray    set     messages: +
2017-09-14 17:51:37  r.david.murray    set     nosy: + r.david.murray; messages: +
2017-09-14 17:48:43  Ali Razmjoo       create