msg244244 - (view) |
Author: (tanbro-liu) |
Date: 2015-05-28 02:20 |
On Windows 8.1 x64, the current user name contains non-ASCII characters. When executing ``pip`` on the command line, the following error occurs::

    C:\Users\雪彦>pip
    Traceback (most recent call last):
      File "C:\Python27\lib\runpy.py", line 162, in _run_module_as_main
        "__main__", fname, loader, pkg_name)
      File "C:\Python27\lib\runpy.py", line 72, in _run_code
        exec code in run_globals
      File "C:\Python27\Scripts\pip.exe\__main__.py", line 9, in <module>
      File "C:\Python27\lib\site-packages\pip\__init__.py", line 210, in main
        cmd_name, cmd_args = parseopts(args)
      File "C:\Python27\lib\site-packages\pip\__init__.py", line 165, in parseopts
        parser.print_help()
      File "C:\Python27\lib\optparse.py", line 1676, in print_help
        file.write(self.format_help().encode(encoding, "replace"))
      File "C:\Python27\lib\optparse.py", line 1656, in format_help
        result.append(self.format_option_help(formatter))
      File "C:\Python27\lib\optparse.py", line 1639, in format_option_help
        result.append(group.format_help(formatter))
      File "C:\Python27\lib\optparse.py", line 1120, in format_help
        result += OptionContainer.format_help(self, formatter)
      File "C:\Python27\lib\optparse.py", line 1091, in format_help
        result.append(self.format_option_help(formatter))
      File "C:\Python27\lib\optparse.py", line 1080, in format_option_help
        result.append(formatter.format_option(option))
      File "C:\Python27\lib\optparse.py", line 322, in format_option
        help_text = self.expand_default(option)
      File "C:\Python27\lib\site-packages\pip\baseparser.py", line 110, in expand_default
        return optparse.IndentedHelpFormatter.expand_default(self, option)
      File "C:\Python27\lib\optparse.py", line 288, in expand_default
        return option.help.replace(self.default_tag, str(default_value))
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 9-10: ordinal not in range(128)

I think we can modify /lib/optparse.py line 288 to avoid this error on Windows::

    -- return option.help.replace(self.default_tag, str(default_value))
    ++ return option.help.replace(
    ++     self.default_tag,
    ++     default_value.encode(sys.getfilesystemencoding())
    ++     if isinstance(default_value, unicode)
    ++     else str(default_value)
    ++ ) |
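The idea behind the proposed change can be sketched as a standalone helper. This is only an illustration, not the actual optparse code: the name `default_to_help_text` is hypothetical, and the `PY2` guard is an assumption added so that the snippet also runs on Python 3, where `str()` already returns text.

```python
import sys

PY2 = sys.version_info[0] == 2


def default_to_help_text(default_value, encoding=None):
    """Return the text substituted for %default (hypothetical helper)."""
    if PY2 and isinstance(default_value, unicode):  # noqa: F821 (Python 2 only)
        # Python 2: encode explicitly instead of letting str() fall back
        # to the ASCII codec, which is what raises UnicodeEncodeError.
        return default_value.encode(encoding or sys.getfilesystemencoding())
    # Non-unicode defaults (and everything on Python 3) keep the old behaviour.
    return str(default_value)
```

The `encoding` parameter is also an assumption; the patch in the message always uses `sys.getfilesystemencoding()`.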
|
|
msg326134 - (view) |
Author: Karthikeyan Singaravelan (xtreak) *  |
Date: 2018-09-23 05:48 |
Thanks for the patch. Would you like to make a GitHub PR? I think it's a general problem in optparse when a default value contains a unicode character and %default is used in the help string. The same code is present in Python 3, but there strings are unicode by default. Example code is below::

    # -*- coding: utf-8 -*-

    from optparse import OptionParser

    parser = OptionParser()
    parser.add_option("-f", "--file", dest="filename",
                      help="write to FILE. Default value %default",
                      metavar="FILE", default="早上好")

    (options, args) = parser.parse_args()

    $ python3.6 ../backups/bpo24307.py --help
    Usage: bpo24307.py [options]

    Options:
      -h, --help            show this help message and exit
      -f FILE, --file=FILE  write to FILE. Default value 早上好

    $ python2.7 ../backups/bpo24307.py --help
    Traceback (most recent call last):
      File "../backups/bpo24307.py", line 9, in <module>
        (options, args) = parser.parse_args()
      File "/usr/local/Cellar/python@2/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/optparse.py", line 1400, in parse_args
        stop = self._process_args(largs, rargs, values)
      File "/usr/local/Cellar/python@2/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/optparse.py", line 1440, in _process_args
        self._process_long_opt(rargs, values)
      File "/usr/local/Cellar/python@2/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/optparse.py", line 1515, in _process_long_opt
        option.process(opt, value, values, self)
      File "/usr/local/Cellar/python@2/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/optparse.py", line 789, in process
        self.action, self.dest, opt, value, values, parser)
      File "/usr/local/Cellar/python@2/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/optparse.py", line 811, in take_action
        parser.print_help()
      File "/usr/local/Cellar/python@2/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/optparse.py", line 1670, in print_help
        file.write(self.format_help().encode(encoding, "replace"))
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 148: ordinal not in range(128)

Thanks |
|
|
msg327322 - (view) |
Author: Karthikeyan Singaravelan (xtreak) *  |
Date: 2018-10-08 06:01 |
Since @tanbro-liu hasn't responded, I am proposing this as an easy issue. The issue is that %default in optparse doesn't handle unicode values. The fix would be to submit the patch as a PR, crediting the original author, and to add a test called test_unicode_default with a unicode default value, similar to test_float_default [1], which uses a float default value. [1] https://github.com/python/cpython/blob/4a7dd30f5810e8861a3834159a222ab32d5c97d0/Lib/test/test_optparse.py#L607 |
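A test along these lines might look like the sketch below. It is not taken from CPython's test suite: the class name, the `prog="demo"` argument and the `run_tests` helper are assumptions made to keep the example self-contained. On Python 3 it should pass as written; on an unpatched Python 2.7 it would be expected to fail with the UnicodeEncodeError reported above.

```python
import unittest
from optparse import OptionParser


class UnicodeDefaultTest(unittest.TestCase):
    def test_unicode_default(self):
        # Same default as the reproducer above, written with escapes.
        default = u"\u65e9\u4e0a\u597d"
        parser = OptionParser(prog="demo")
        parser.add_option("-f", "--file", dest="filename", metavar="FILE",
                          help="write to FILE. Default value %default",
                          default=default)
        # format_help() expands %default without writing to stdout.
        self.assertIn(default, parser.format_help())


def run_tests():
    """Run the test case above and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(UnicodeDefaultTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```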
|
|
msg327881 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2018-10-17 09:45 |
I suppose there is a similar issue in Python 3 with a bytes default. Using unicode() in Python 2 will make the help string a Unicode string, which can cause an issue with a translated help string, and it will also cause an issue with non-ASCII 8-bit strings. Using repr() looks like the right way to solve such issues, but it will change the output for 8-bit strings. |
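The trade-off between str() and repr() can be seen in a small sketch. It assumes Python 3 run without the `-b` flag, where str() of a bytes object returns its repr rather than decoded text; `render_default` is a hypothetical name used only for this comparison.

```python
def render_default(value):
    """Return (str(value), repr(value)) so the two renderings can be compared."""
    return str(value), repr(value)


# A text default: repr() adds quotes, so the help output would change.
text_pair = render_default(u"abc")

# A bytes default on Python 3: str() and repr() both give the b'...' form.
bytes_pair = render_default(b"\xe6\x97\xa9")
```

In this sketch repr() never raises, but it changes how ordinary string defaults appear in the help text, which is the cost mentioned in the message.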
|
|
msg327883 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-10-17 10:02 |
pip is not part of Python 2, so I suggest closing this issue as "third party". I dislike changing optparse just for pip. For me, the bug should be fixed in pip, not in optparse. I see a high risk of breaking applications which currently work as expected. If the default value is a non-ASCII byte string, unicode() will raise a UnicodeDecodeError. |
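The failure mode described in the last sentence can be emulated as follows. In Python 2, unicode() applied to a byte string without an explicit encoding decodes it with the default codec, which is ASCII unless changed. The sketch makes that decode explicit so it behaves the same on Python 3; both function names are hypothetical.

```python
def implicit_unicode(byte_string):
    """Emulate Python 2's unicode(byte_string) with the default ASCII codec."""
    return byte_string.decode("ascii")


def fails_to_decode(byte_string):
    """Return True if the ASCII decode raises UnicodeDecodeError."""
    try:
        implicit_unicode(byte_string)
    except UnicodeDecodeError:
        return True
    return False
```

Here `fails_to_decode(b"\xe6\x97\xa9")` is true, while a pure-ASCII default such as `b"FILE"` decodes without error.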
|
|
msg327887 - (view) |
Author: Karthikeyan Singaravelan (xtreak) *  |
Date: 2018-10-17 10:16 |
@Victor I think this is an issue with optparse, which can't handle non-ASCII strings for %default; pip merely exposes it. I can see similar places in argparse where non-ASCII strings can cause issues, for example with unicode choices. I think this is a general issue: wherever str() is used, non-ASCII strings throw this error. I am quite new to unicode, so I don't know whether this needs to be fixed in Python 2.7 or whether it's an error on the user's end, where their script needs to be fixed. |
|
|
msg332337 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2018-12-22 08:35 |
Even if you encode the Unicode default for output, the user cannot specify the same value unless you use a custom converter. For example, if you encode u"早上好" as the string "\xe6\x97\xa9\xe4\xb8\x8a\xe5\xa5\xbd" (in UTF-8), the user can only specify the argument as an 8-bit string "\xe6\x97\xa9\xe4\xb8\x8a\xe5\xa5\xbd", which differs from the Unicode string u"早上好". Even if you use a custom converter which decodes 8-bit strings to Unicode, it makes sense to specify the default value as an encoded string, because it will be passed to the converter. Non-ASCII unicode values were never supported as default values. This issue is a feature request rather than a bug report. It is too late to add new features in 2.7. The right solution is to upgrade to Python 3. After all, solving issues like this was one of the purposes of creating Python 3. |
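The mismatch between the encoded and the Unicode form can be checked directly. The sketch below uses the byte sequence from the message; `decode_argument` is a hypothetical converter, and UTF-8 as its default encoding is an assumption, not something optparse provides.

```python
# UTF-8 bytes and the corresponding Unicode text from the message above.
ENCODED = b"\xe6\x97\xa9\xe4\xb8\x8a\xe5\xa5\xbd"
TEXT = u"\u65e9\u4e0a\u597d"


def decode_argument(value, encoding="utf-8"):
    """Hypothetical converter: turn an 8-bit argument into Unicode text."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    # Already text: return unchanged.
    return value


# The raw 8-bit string is not equal to the Unicode string ...
raw_matches = ENCODED == TEXT
# ... but it is once the converter has decoded it.
converted_matches = decode_argument(ENCODED) == TEXT
```

Without such a converter, a value typed on a Python 2 command line arrives as the 8-bit form and would not compare equal to a Unicode default.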
|
|