Issue 4319: optparse and non-ascii help strings

(copied from the Optik bug tracker)

Related bug: http://www.mail-archive.com/python-bugs-list@python.org/msg07227.html

Hi all,

It seems to me that the workaround for the above bug in optparse.py version 1.5.3 introduces a new bug when help strings are byte strings (as opposed to unicode strings) containing non-ascii characters. Consider the following script:

    $ cat test.py
    #!/usr/bin/env python
    # -*- coding:latin-1 -*-

    import optparse
    parser = optparse.OptionParser()
    parser.add_option("--test",help="This does not work: é")
    parser.parse_args()

When called with "$ ./test.py --help", this script fails with the following traceback:

    $ ./test.py -h
    Traceback (most recent call last):
      File "./test.py", line 7, in <module>
        parser.parse_args()
      File "/usr/lib/python2.5/optparse.py", line 1385, in parse_args
        stop = self._process_args(largs, rargs, values)
      File "/usr/lib/python2.5/optparse.py", line 1429, in _process_args
        self._process_short_opts(rargs, values)
      File "/usr/lib/python2.5/optparse.py", line 1536, in _process_short_opts
        option.process(opt, value, values, self)
      File "/usr/lib/python2.5/optparse.py", line 782, in process
        self.action, self.dest, opt, value, values, parser)
      File "/usr/lib/python2.5/optparse.py", line 804, in take_action
        parser.print_help()
      File "/usr/lib/python2.5/optparse.py", line 1655, in print_help
        file.write(self.format_help().encode(encoding, "replace"))
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 117: ordinal not in range(128)

This behaviour can be reproduced with utf-8 encoded strings as well.

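If I understand correctly, line 1655 of optparse.py only works if format_help() returns an ascii byte string or a unicode string. When it returns a byte string containing non-ascii characters, the call to encode() fails: Python 2 first has to decode the byte string implicitly with the default ascii codec, which would explain why the error is a UnicodeDecodeError rather than a UnicodeEncodeError.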
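To illustrate the failure described above, here is a minimal sketch written for Python 3, where the implicit ascii decode that Python 2 performs on byte strings has to be spelled out by hand. The helper names py2_style_encode and safe_encode, and the source_encoding parameter, are hypothetical and are not part of optparse; the second helper is only one possible workaround, assuming the caller knows the encoding of the help string.

```python
# Python 3 sketch that mimics Python 2's behaviour for str.encode().
# The helpers below are hypothetical illustrations, not optparse code.

# A byte string with a non-ascii character, as in test.py (latin-1 source).
help_bytes = "This does not work: é".encode("latin-1")


def py2_style_encode(text, encoding):
    """Mimic Python 2: a byte string is ascii-decoded before being encoded."""
    if isinstance(text, bytes):
        # This decode is implicit in Python 2 and fails on byte 0xe9.
        text = text.decode("ascii")
    return text.encode(encoding, "replace")


def safe_encode(text, encoding, source_encoding="latin-1"):
    """Possible workaround (assumption): decode byte strings explicitly."""
    if isinstance(text, bytes):
        text = text.decode(source_encoding)
    return text.encode(encoding, "replace")


error = None
try:
    py2_style_encode(help_bytes, "utf-8")
except UnicodeDecodeError as exc:
    # Same exception type as in the traceback from print_help().
    error = str(exc)
    print(error)

# Decoding with the right source encoding first avoids the error.
print(safe_encode(help_bytes, "utf-8"))
```

In this sketch, pure ascii byte strings and unicode strings pass through py2_style_encode without trouble, which matches the two cases that line 1655 appears to handle.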

I think this is either a bug and should be fixed, or very misleading (and should be fixed too :).

I hope to have helped even a little. Thanks for optparse, and keep up the good work!

Cheers, Antoine