Issue 29715: Argparse improperly handles "-_"

Created on 2017-03-03 21:07 by Max Rothman, last changed 2022-04-11 14:58 by admin.

Messages (9)
msg288929 - (view) Author: Max Rothman (Max Rothman) Date: 2017-03-03 21:07
In the case detailed below, argparse.ArgumentParser improperly parses the argument string "-_":

```
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('first')
print(parser.parse_args(['-_']))
```

Expected behavior: prints Namespace(first='-_')
Actual behavior: prints usage message

The issue seems to be specific to the string "-_". Either character alone, or both in the opposite order, does not trigger the issue.
msg288939 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-03-03 22:24
Have you tried '-' plus any other character? argparse treats '-' and '--' specially, and this is a known issue.
msg288946 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-03-04 00:32
This is actually expected behaviour of “argparse”, as well as of Unix CLI programs in general. See the documentation <https://docs.python.org/3.6/library/argparse.html#arguments-containing>. The general workaround is to use a double-dash separator:

>>> parser.parse_args(['--', '-_'])
Namespace(first='-_')

Example with the GNU “rm” command:

$ echo "make a file" >-_
$ rm -_
rm: invalid option -- '_'
Try 'rm ./-_' to remove the file '-_'.
Try 'rm --help' for more information.
[Exit 1]
$ rm -- -_  # Double dash also works

Although I suppose the error message could be improved. Currently it looks like it ignores the argument:

>>> parser.parse_args(['-_'])
usage: [-h] first
: error: the following arguments are required: first
__main__.SystemExit: 2
msg288987 - (view) Author: Max Rothman (Max Rothman) Date: 2017-03-04 17:09
Martin: huh, I didn't notice that documentation. The error message definitely could be improved. It still seems like an odd choice given that argparse knows about the expected spec, so it knows whether there are any options or not. Perhaps one could enable/disable this cautious behavior with a flag passed to ArgumentParser? It was rather surprising in my case, since I was parsing morse code and the arguments were random combinations of "-", "_", and "*", so it wasn't immediately obvious what the issue was.
msg289504 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2017-03-12 17:45
I think that this string falls through to the last case in 'parser._parse_optional' (the first parsing loop):

    # it was meant to be an optional but there is no such option
    # in this parser (though it might be a valid option in a subparser)
    return None, arg_string, None

It has the format of an optional flag, not a positional argument. If preceded by '--' it gets classed as an argument.

(In the second, main, parsing loop) Since it doesn't match any defined Actions it gets put in the list of 'extras' (as returned by 'parse_known_args'). But the parser also runs a check on required arguments, and finds that the positional, 'first', was not filled. So that's the error that's raised.

For example, if I provide another string that fills the positional:

In [5]: parser.parse_known_args(['-_','other'])
Out[5]: (Namespace(first='other'), ['-_'])

'parse_args' would produce an 'error: unrecognized arguments: -_' error.

I don't see how the error message could be improved without some major changes in the testing and parsing. It would either have to disallow unmatched optional flags (and maybe break subparsers) or deduce that this 'extra' was meant for the unfilled positional. Bernard has argued that it is better to raise an error in ambiguous cases than to make too many assumptions about what the user intended.
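The two outcomes described in the message above can be reproduced with a short script. This is only a sketch against the public argparse API; the `prog="demo"` name and the variable names are illustrative and not from the thread.

```python
import argparse

# Same parser as in the original report: one required positional, no options.
parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("first")

# '-_' looks like an option flag, matches no defined action, and so lands
# in the list of extras; 'other' fills the positional.
namespace, extras = parser.parse_known_args(["-_", "other"])
print(namespace, extras)

# With '-_' alone, the positional stays unfilled. argparse prints a usage
# message to stderr and exits, which surfaces as SystemExit.
try:
    parser.parse_args(["-_"])
    exit_code = None
except SystemExit as exc:
    exit_code = exc.code
print("exit code:", exit_code)
```

The exit happens because of the required-arguments check, not because '-_' itself was rejected, which is why the message never mentions '-_'.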
msg289521 - (view) Author: Max Rothman (Max Rothman) Date: 2017-03-13 02:25
I think that makes sense, but there's still an open question: what should the correct way be to allow dashes to be present at the beginning of positional arguments?
msg289530 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2017-03-13 04:11
http://bugs.python.org/issue9334, 'argparse does not accept options taking arguments beginning with dash (regression from optparse)', is an old discussion about strings that begin with a dash and don't match defined flags. One proposal was to add an 'args_default_to_positional' parameter, and change the parsing that I described before to:

+        # behave more like optparse even if the argument looks like a option
+        if self.args_default_to_positional:
+            return None  # instead of return None, arg_string, None

There's a long discussion but nothing was changed (not even the test for negative numbers).

Two workarounds still apply:

    prog.py -- -_        # use -- to signal positional values
    prog.py --first=-_   # = to attach any string to optional

(In my previous post I cited 'Bernard'; I meant the module's original author, Steven Bethard. He's no longer actively involved in these bugs/issues.)
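Both workarounds can also be applied from Python rather than on the command line. The sketch below assumes the program has no options of its own that must still be recognised. `parse_all_positional` is a hypothetical helper, and the `morse` and `demo` parser names and the `symbols` argument are made up for illustration.

```python
import argparse


def parse_all_positional(parser, argv):
    """Hypothetical helper: prepend '--' so every item is read as positional."""
    return parser.parse_args(["--", *argv])


# Workaround 1: '--' marks everything after it as positional values.
morse_parser = argparse.ArgumentParser(prog="morse")
morse_parser.add_argument("symbols", nargs="+")
positional_result = parse_all_positional(morse_parser, ["-_", "*-", "__"])
print(positional_result)

# Workaround 2: make the value an option and attach it with '='.
option_parser = argparse.ArgumentParser(prog="demo")
option_parser.add_argument("--first")
option_result = option_parser.parse_args(["--first=-_"])
print(option_result)
```

Prepending '--' also stops the parser from recognising '-h', so the first approach only suits programs whose whole command line is data.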
msg289536 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-03-13 10:42
Max, I’m not sure if you saw the double-dash (--) workaround. IMO that is the “correct” way to do this for Unix command lines, and for the current version of “argparse”. But I guess that may be too inconvenient for your Morse Code case. Perhaps you can write your own custom sys.argv parser, or find some other argument handling library out there that doesn’t follow the usual Unix conventions.

I don’t really like the proposal from Issue 9334 (classifying CLI arguments based on registered options). It seems hard to predict and specify (too complex) for only a minor use case. Although it does fix part of the other problem with option arguments, it is not a general solution.

Assuming “-h” and “--help” are registered by default, how would an invocation like “prog.py -hi” be treated under the proposal (currently an error because -h does not accept an argument)? What about “prog.py -help”? What about “prog.py --h”, currently treated as an abbreviation of “--help”?
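A custom sys.argv parser for the Morse case could be very small. The following is a hypothetical sketch, not something proposed in the thread: `parse_morse_args` and `ALLOWED_SYMBOLS` are invented names, and the allowed alphabet is taken from the earlier message that mentions "-", "_", and "*".

```python
import sys

# Assumption: the only valid characters are the three named in the thread.
ALLOWED_SYMBOLS = frozenset("-_*")


def parse_morse_args(argv):
    """Treat every argument as data; there are no option flags at all."""
    words = []
    for arg in argv:
        unexpected = set(arg) - ALLOWED_SYMBOLS
        if not arg or unexpected:
            raise ValueError(f"not a Morse word: {arg!r}")
        words.append(arg)
    return words


# In a real script this would be parse_morse_args(sys.argv[1:]).
words = parse_morse_args(["-_", "*-", "__*"])
print(words)
```

The cost of this design is that conventions such as '-h' no longer exist, so any help text has to be triggered some other way.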
msg289555 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2017-03-13 23:38
The one change to `_parse_optional` that did go through is the ability to turn off abbreviations: http://bugs.python.org/issue14910. Even that has had a few complaints; see http://bugs.python.org/issue29777.
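The abbreviation switch is the `allow_abbrev` parameter of `argparse.ArgumentParser`. Here is a minimal sketch of its effect; the `make_parser` helper and the `--verbose` option are illustrative and not taken from the thread.

```python
import argparse


def make_parser(allow_abbrev):
    """Build the same parser with abbreviations switched on or off."""
    parser = argparse.ArgumentParser(prog="demo", allow_abbrev=allow_abbrev)
    parser.add_argument("--verbose", action="store_true")
    return parser


# Default behaviour: an unambiguous prefix of a long option is accepted.
abbrev_ns = make_parser(True).parse_args(["--verb"])
print(abbrev_ns)

# With allow_abbrev=False the prefix matches nothing and becomes an extra.
strict_ns, strict_extras = make_parser(False).parse_known_args(["--verb"])
print(strict_ns, strict_extras)
```

Using `parse_args` instead of `parse_known_args` in the second case would exit with an "unrecognized arguments" error.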
History
Date User Action Args
2022-04-11 14:58:43 admin set github: 73901
2017-03-15 06:49:57 mbdevpl set title: Arparse improperly handles "-_" -> Argparse improperly handles "-_"
2017-03-13 23:38:03 paul.j3 set messages: +
2017-03-13 10:42:19 martin.panter set messages: +
2017-03-13 04:11:15 paul.j3 set messages: +
2017-03-13 02:25:13 Max Rothman set messages: +
2017-03-12 17:45:04 paul.j3 set messages: +
2017-03-12 16:54:22 paul.j3 set nosy: + paul.j3
2017-03-04 17:09:36 Max Rothman set messages: +
2017-03-04 00:32:46 martin.panter set nosy: + martin.panter; messages: +
2017-03-03 22:24:55 r.david.murray set nosy: + r.david.murray; messages: +
2017-03-03 21:07:15 Max Rothman create