bpo-11874 argparse assertion failure by wimglenn · Pull Request #1826 · python/cpython (original) (raw)
import argparse
parser = argparse.ArgumentParser('prog'*8)
parser.add_argument('--proxy', metavar='<http[s]://example:1234>')
args = parser.parse_args(['--help'])
The script above triggers assertion errors from argparse library code (specifically, from here).
This bug that has been floating around for years, and has already bitten several people before it bit me. See for example:
http://bugs.python.org/issue25058
http://bugs.python.org/issue14046
http://bugs.python.org/issue24089
http://bugs.python.org/issue11874
There are a few ways we could deal with it:
- Just remove the asserts, which are bogus.
- Disallow using characters
(
,[
,)
,]
in metavar strings. - Tighten up the part_regexp pattern.
- Don't use regex.
1 - The only consequence here would be that wrapped usage messages might be jacked-up if people use characters like )
and ]
in their metavar strings. But this seems like a sloppy and poor resolution.
2 - a sound approach, but it's not backwards compat. Using those troublesome characters in metavars only causes problems when there is line-wrapping, so users may have existing scripts which don't currently hit the issue, and those scripts would suddenly start failing. That's a deal-breaker.
4 - probably the most ideal approach, but it requires rewriting a large chunk of argparse
code. Some other guys tried this in issue11874. Their patches have just been sitting there for 5 years with little interest, so I'm hesitant to go down that route again.
Went with 3 in this PR. The tightened up regex also matches a space (or end) after the closing bracket. In fact it is not possible to solve the underlying issue exactly with regex, since metavar can theoretically be any arbitrary string. After my patch, the AssertionError
can still fire if a metavar contains a substring like ] -v [
. That would be a much rarer/pathological case, unlike the current failure mode which users can and will bump into fairly easily. argparse.py
code is not pretty but the test coverage is quite good (1500+ tests) so I'm confident that improving the regex at least doesn't cause any extra issues.
This is the "practicality beats purity" approach.