
Eli did give his use case: a front end for a program that takes a "--sync" parameter. The front end's preprocessor defined "--sync-foo" as one of its own arguments, and he wanted "--sync" to be left in the parameters passed on to the back-end program.

The front end would ideally be designed with the back end's parameters in mind so that the two do not conflict, but the documentation could likely be improved.

It might also be possible to add a setting to disable the prefix matching feature, for code that prefers it not be done. Whether that is better done as a global setting or as a per-parameter setting, I haven't thought through. But both the constructor and the parameter definitions already accept a variable number of named parameters, so I would think it would be possible to add another, and retain backward compatibility via an appropriate default.
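For readers coming to this thread later: a parser-wide setting along these lines did eventually land as the `allow_abbrev` keyword of `ArgumentParser` in Python 3.5, defaulting to True for backward compatibility. A minimal sketch of how it behaves, assuming Python 3.5 or newer (the argument lists passed in are made up for illustration):

```python
import argparse

# allow_abbrev=False (Python 3.5+) turns off prefix matching of long
# options for this parser; the default, True, keeps the old behavior.
parser = argparse.ArgumentParser(allow_abbrev=False)
parser.add_argument('--sync-foo', action='store_true')

# "--sync" is no longer treated as an abbreviation of "--sync-foo",
# so parse_known_args leaves it among the unrecognized arguments.
args, rest = parser.parse_known_args(['--sync', '--sync-foo'])
print(args.sync_foo, rest)
```

With the setting off, only the fully spelled "--sync-foo" is consumed and "--sync" stays available to hand to the back end.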

On 11/26/2013 9:38 AM, Guido van Rossum wrote:
I think matching on the shortest unique prefix is common for command line parsers in general, not just argparse. I believe optparse did this too, and even the venerable getopt does! I think all this originated in the original (non-Python) GNU standard for long option parsing. All that probably explains why the docs hardly touch upon it.
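The same behavior can be checked in the standard library's getopt and optparse modules. This is only a small sketch; the option name is borrowed from Eli's example and the argument lists are made up for illustration:

```python
import getopt
from optparse import OptionParser

# getopt: "--sync" is expanded to the unique long option it prefixes.
opts, rest = getopt.getopt(['--sync'], '', ['sync-foo'])
print(opts)

# optparse: the abbreviation is accepted the same way and stored
# under the destination derived from the full option name.
op = OptionParser()
op.add_option('--sync-foo', action='store_true', default=False)
values, leftover = op.parse_args(['--sync'])
print(values.sync_foo)
```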

As to why parse_known_args also does this, I can see the reasoning behind this behavior: to the end user, "--sync" is a valid option, so it would be surprising if it didn't get recognized under certain conditions.

I suppose you were badly bitten by this recently? Can you tell us more about what happened?


On Tue, Nov 26, 2013 at 9:30 AM, Eli Bendersky <eliben@gmail.com> wrote:
Hello,

argparse does prefix matching as long as there are no conflicts. For example:

import argparse

argparser = argparse.ArgumentParser()
argparser.add_argument('--sync-foo', action='store_true')
args = argparser.parse_args()

If I pass "--sync" to this script, it recognizes it as "--sync-foo". This behavior is quite surprising although I can see the motivation for it. At the very least it should be much more explicitly documented (AFAICS it's barely mentioned in the docs).

If there's another argument registered, say "--sync-bar" the above will fail due to a conflict.
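Both cases can be reproduced without a command line by passing the argument list explicitly. This is just a sketch; the explicit lists and the `matched`/`ambiguous` variable names are only for illustration:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--sync-foo', action='store_true')

# "--sync" is a unique prefix of "--sync-foo", so it is accepted.
args = parser.parse_args(['--sync'])
matched = args.sync_foo

# Registering a second option with the same prefix makes "--sync" ambiguous.
parser.add_argument('--sync-bar', action='store_true')
try:
    parser.parse_args(['--sync'])
    ambiguous = False
except SystemExit:
    # argparse reports the ambiguous option on stderr and exits.
    ambiguous = True

print(matched, ambiguous)
```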

Now comes the nasty part. When using "parse_known_args" instead of "parse_args", the above happens too: --sync is recognized as --sync-foo and captured by the parser. But this is wrong! The whole idea of parse_known_args is to parse the known args, leaving unknowns alone. This prefix matching harms more than it helps here because maybe the program we're actually acting as a front-end for (and hence using parse_known_args) knows about --sync and wants to get it.
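A short sketch of that front-end scenario, where "--other" stands in for some hypothetical back-end option the front end knows nothing about:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--sync-foo', action='store_true')

# The front end only knows "--sync-foo"; "--sync" and "--other" are
# meant for the back-end program.
args, rest = parser.parse_known_args(['--sync', '--other'])

# "--sync" has been consumed as an abbreviation of "--sync-foo";
# only "--other" is left to pass along.
print(args.sync_foo, rest)
```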

Unless I'm missing something, this is a bug. But I'm also not sure whether we can do anything about it at this point, as existing code *may* be relying on it. The right thing to do would be to disable this prefix matching when parse_known_args is called.

Again, at the very least this should be documented (for parse_known_args, with no less than a warning box, IMHO).

Eli