Issue 28718: '*' matches entire path in fnmatch

Created on 2016-11-16 20:12 by Jim Nasby, last changed 2022-04-11 14:58 by admin.

Messages (8)
msg280985 - Author: Jim Nasby (Jim Nasby) * Date: 2016-11-16 20:12
A '*' in fnmatch.translate is converted into '.*', which will greedily match directory separators. This doesn't match shell behavior, which is that * will only match file names:

decibel@decina:[14:07]~$ls ~/tmp/*/1|head
ls: /Users/decibel/tmp/*/1: No such file or directory
decibel@decina:[14:07]~$ls ~/tmp/d*/base/1|head
112

From a posix standpoint, this would easily be fixed by using '[^/]*' instead of '.*'. I'm not sure how to make this work cross-platform though.

It's worth noting that some programs (rsync, git) support **, which would correctly translate to '.*'.
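A minimal sketch of the difference described above. The two regexes are written by hand for illustration and are not the exact output of fnmatch.translate, which varies between Python versions; the sample path is made up.

```python
import fnmatch
import re

path = "tmp/d/base/1"

# fnmatch treats '*' as "any run of characters", so it spans '/'.
print(fnmatch.fnmatchcase(path, "tmp/*/1"))   # True

# Hand-written regexes for the pattern 'tmp/*/1':
greedy = re.compile(r"tmp/.*/1")        # '*' -> '.*' (current behavior)
per_level = re.compile(r"tmp/[^/]*/1")  # '*' -> '[^/]*' (POSIX-only fix)

print(bool(greedy.fullmatch(path)))          # True
print(bool(per_level.fullmatch(path)))       # False: 'd/base' contains '/'
print(bool(per_level.fullmatch("tmp/d/1")))  # True: one path level only
```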
msg281017 - Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2016-11-17 02:00
Presumably something like:

    r'(?:' + r'|'.join({re.escape(os.path.sep), re.escape(os.path.altsep)}) + r')'

would cover it completely. I switched to using non-capturing groups over a character class both to deal with the fact that escaping doesn't work the same way for character classes and to cover the possibility (no idea here) that some terrible OS might have a multicharacter path separator.
msg281018 - Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2016-11-17 02:01
Oops, altsep is None, not the empty string when there is only one separator. And I didn't handle inverting the match. Sigh. You get the idea.
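One possible way to address both points (altsep being None, and inverting the match) is a negative lookahead in front of each consumed character. This is only a sketch: the helper name non_separator_run and its seps parameter are hypothetical, not part of fnmatch.

```python
import os
import re

def non_separator_run(seps=None):
    """Return a regex fragment matching any run of characters that
    contains no path separator (hypothetical helper, not stdlib)."""
    if seps is None:
        # os.path.altsep is None on platforms with a single separator.
        seps = [s for s in (os.path.sep, os.path.altsep) if s]
    alternation = "|".join(sorted({re.escape(s) for s in seps}))
    # The lookahead inverts the match: consume one character at a time,
    # but only where no separator starts. Works for multi-character
    # separators too, which a character class could not express.
    return rf"(?:(?!{alternation})(?s:.))*"

pat = non_separator_run(["/", "\\"])
print(bool(re.fullmatch(pat, "name.txt")))  # True
print(bool(re.fullmatch(pat, "a/b")))       # False
```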
msg288608 - Author: Aaron Whitehouse (aaron-whitehouse) Date: 2017-02-26 18:20
Note that somebody has forked the standard library to implement this: https://github.com/kianxineki/python-wildcard

This shows that the actual changes would be pretty small (though pywildcard is based on 2.x code and does not handle the cross-platform slashes you have been discussing).

It is also worth noting that the glob module in the standard library: https://docs.python.org/3.7/library/glob.html implements a "recursive" option that has similar behaviour (* does not span path separators whereas ** does) and essentially builds this on top of fnmatch for the actual filename matching.

I do not think we can change the default behaviour of fnmatch at this point, but I would like to see this behaviour triggered by an optional argument to the various functions, e.g.:

    fnmatch.fnmatch(filename, pattern, glob_asterisks=False)
    fnmatch.fnmatchcase(filename, pattern, glob_asterisks=False)
    fnmatch.filter(names, pattern, glob_asterisks=False)
    fnmatch.translate(pattern, glob_asterisks=False)

In each case, if glob_asterisks (or whatever other name we come up with) is true, the behaviour would match the pywildcard behaviour, i.e.:

    ** matches everything
    * matches in one path level

I look after the glob matching code in duplicity and would like to start using the standard library to do filename matching for us, but we need the above behaviour. I am happy to do the patching if there is a realistic chance of it being accepted.
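The proposed semantics could be sketched roughly as follows. This is an illustration only, not the pywildcard code or a proposed patch: the function name translate_glob_asterisks and the sep parameter are hypothetical, a single-character separator is assumed, and bracket expressions such as [abc] are treated as literals for brevity.

```python
import re

def translate_glob_asterisks(pattern, sep="/"):
    """Hypothetical translate(): '**' matches everything, while '*' and
    '?' stay within one path level. Assumes a one-character separator."""
    not_sep = "[^" + re.escape(sep) + "]"
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")            # spans separators
            i += 2
        elif pattern[i] == "*":
            out.append(not_sep + "*")   # stops at a separator
            i += 1
        elif pattern[i] == "?":
            out.append(not_sep)
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return "(?s:" + "".join(out) + r")\Z"

one_level = re.compile(translate_glob_asterisks("tmp/*/1"))
any_depth = re.compile(translate_glob_asterisks("tmp/**/1"))

print(bool(one_level.match("tmp/d/1")))       # True
print(bool(one_level.match("tmp/d/base/1")))  # False
print(bool(any_depth.match("tmp/d/base/1")))  # True
```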
msg290624 - Author: Aaron Whitehouse (aaron-whitehouse) Date: 2017-03-27 16:01
Posted to the [Python-ideas] mailing list, as it is proposing a change to a standard library: https://mail.python.org/pipermail/python-ideas/2017-February/044880.html Nobody has responded so far, however. I take this as at least no vehement objection to the idea.
msg307867 - Author: Alberto Galera (Alberto Galera) Date: 2017-12-08 20:09
I see that they have commented on the lib that I made a few years ago (python-wildcard). The reason for the creation of that little fork started in this issue: https://bugs.python.org/issue25734
msg339054 - Author: Toon Verstraelen (Toon Verstraelen) * Date: 2019-03-28 15:59
For consistency with the corresponding feature in the glob function since Python 3.5, I would suggest adding an extra optional argument 'recursive' instead of 'glob_asterisks'. With the default recursive=False, one gets the old behavior; with recursive=True, it can handle the '**' and '*' as in pywildcard. I realize that with recursive=False, the behavior is not exactly consistent with glob, but I'd still prefer the same name for the optional argument. It is the common terminology for this type of feature. See https://en.wikipedia.org/wiki/Matching_wildcards
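For comparison, this is how the existing glob.glob handles '*' and '**' with and without recursive=True. The directory layout is made up for the example and created in a temporary directory.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Layout: <root>/d/base/1
    nested = os.path.join(root, "d", "base")
    os.makedirs(nested)
    open(os.path.join(nested, "1"), "w").close()

    # '*' covers exactly one path level, so <root>/*/1 finds nothing.
    single = glob.glob(os.path.join(root, "*", "1"))

    # Without recursive=True, '**' behaves like a plain '*'.
    double_off = glob.glob(os.path.join(root, "**", "1"))

    # With recursive=True, '**' spans any number of directories.
    double_on = glob.glob(os.path.join(root, "**", "1"), recursive=True)
    found = [os.path.relpath(p, root) for p in double_on]

print(single)      # []
print(double_off)  # []
print(found)       # one entry: d/base/1 (with the platform's separator)
```

Note that in glob, '*' never spans separators even with recursive=False, which is the inconsistency with the current fnmatch default mentioned above.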
msg339256 - Author: Toon Verstraelen (Toon Verstraelen) * Date: 2019-03-31 12:59
Just for reference, here are a few more implementations of the same idea, next to pywildcard, sometimes combined with other useful features:

- https://github.com/LawfulHacker/fnmatch2
- https://github.com/demurgos/py-pathmatch
- https://github.com/vidartf/globmatch
- https://github.com/facelessuser/wcmatch

The last one is rather active, with regular releases, last one on March 24, 2019.
History
Date User Action Args
2022-04-11 14:58:39 admin set github: 72904
2019-03-31 12:59:09 Toon Verstraelen set messages: +
2019-03-28 17:42:37 xtreak set nosy: + serhiy.storchaka
2019-03-28 15:59:15 Toon Verstraelen set nosy: + Toon Verstraelen; messages: +
2017-12-08 20:09:36 Alberto Galera set nosy: + Alberto Galera; messages: +
2017-03-27 16:01:52 aaron-whitehouse set messages: +
2017-02-26 18:20:35 aaron-whitehouse set nosy: + aaron-whitehouse; messages: +; title: '*' matches entire path in fnmatch.translate -> '*' matches entire path in fnmatch
2016-11-17 02:01:17 josh.r set messages: +
2016-11-17 02:00:14 josh.r set nosy: + josh.r; messages: +
2016-11-16 20:12:07 Jim Nasby create