msg254344 - (view) |
Author: Xavier de Gaye (xdegaye) *  |
Date: 2015-11-08 15:54 |
On archlinux during an upgrade, the package manager backs up some files in /etc with a .pacnew extension. On my system there are 20 such files: 9 .pacnew files located in /etc and 11 .pacnew files in subdirectories of /etc.

The following commands are run from /etc:

$ shopt -s globstar
$ ls **/*.pacnew | wc -w
20
$ ls *.pacnew | wc -w
9

With python:

$ python
Python 3.6.0a0 (default:72cca30f4707, Nov 2 2015, 14:17:31)
[GCC 5.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import glob
>>> len(glob.glob('./**/*.pacnew', recursive=True))
20
>>> len(glob.glob('*.pacnew'))
9
>>> len(glob.glob('**/*.pacnew', recursive=True))
11

The '**/*.pacnew' pattern does not list the files in /etc, only those located in the subdirectories of /etc. |
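The report above can be reproduced without touching /etc. The following is a minimal sketch, assuming a throwaway directory tree with made-up file names; `working_directory`, `build_sample_tree` and `pacnew_counts` are hypothetical helpers, not part of the glob module. On an interpreter that includes the fix discussed later in this issue, the two recursive patterns should report the same count.

```python
import glob
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily change the current directory (hypothetical helper)."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


def build_sample_tree(root):
    """Create 2 top-level and 3 nested .pacnew files (made-up names)."""
    os.makedirs(os.path.join(root, "sub", "deeper"))
    for rel in (
        "a.pacnew",
        "b.pacnew",
        os.path.join("sub", "c.pacnew"),
        os.path.join("sub", "d.pacnew"),
        os.path.join("sub", "deeper", "e.pacnew"),
    ):
        open(os.path.join(root, rel), "w").close()


def pacnew_counts(root):
    """Count matches for the three patterns from the report, run from root."""
    with working_directory(root):
        return {
            "./**/*.pacnew": len(glob.glob("./**/*.pacnew", recursive=True)),
            "*.pacnew": len(glob.glob("*.pacnew")),
            "**/*.pacnew": len(glob.glob("**/*.pacnew", recursive=True)),
        }


with tempfile.TemporaryDirectory() as tmp:
    build_sample_tree(tmp)
    counts = pacnew_counts(tmp)
    print(counts)
```

On the affected build described in the report, the last count would instead cover only the nested files.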
|
msg254352 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2015-11-08 17:46 |
I believe this behavior matches the documentation: "If the pattern is followed by an os.sep, only directories and subdirectories match." ('the pattern' being '**') I wonder if '***.pacnew' would work. |
|
|
msg254354 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2015-11-08 18:30 |
I no longer remember whether it was a deliberate design or just an implementation detail. In any case it is not documented.

> I believe this behavior matches the documentation:

No, it is not related. That sentence means that './**/' will list only directories, not regular files.

> I wonder if '***.pacnew' would work.

No, '**' works only as a whole path component. |
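Both points in the reply above can be checked with a short sketch. It assumes a temporary tree with made-up names and uses absolute patterns so that the current directory does not matter; the variable names are illustrative only.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("top.pacnew", os.path.join("sub", "nested.pacnew")):
        open(os.path.join(root, rel), "w").close()

    # A pattern ending in '**' plus a separator returns directories only.
    dirs_only = glob.glob(os.path.join(root, "**", ""), recursive=True)
    all_are_dirs = all(os.path.isdir(p) for p in dirs_only)

    # '***.pacnew' is not a whole '**' path component, so it is treated as
    # an ordinary wildcard and does not descend into subdirectories.
    triple_star = glob.glob(os.path.join(root, "***.pacnew"), recursive=True)
    triple_star_names = sorted(os.path.relpath(p, root) for p in triple_star)

print(all_are_dirs, triple_star_names)
```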
|
|
msg254366 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2015-11-09 04:39 |
Ah, I see, 'pattern' there means the whole pattern. That certainly isn't clear. |
|
|
msg254370 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2015-11-09 07:14 |
Likely it was an implementation artifact. The current implementation is simpler and better fitted the existing glob design. The problem was that '**/a' should list 'a' and 'd/a', but '**/' should list only 'd/', and not ''. Here is a patch that makes '**' also match zero directories. The old tests pass, and new tests are added to cover this case. |
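The intended semantics described above ('**' matching zero or more directories) can be illustrated with a small sketch. It assumes an interpreter that already contains this fix and a temporary tree holding a file 'a' and a directory 'd' containing another 'a'; the names mirror the message and the variables are illustrative.

```python
import glob
import os
import tempfile

previous = os.getcwd()
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "d"))
    for rel in ("a", os.path.join("d", "a")):
        open(os.path.join(root, rel), "w").close()

    # Relative patterns starting with '**' are the case under discussion,
    # so the globbing has to run from inside the tree.
    os.chdir(root)
    try:
        files = sorted(glob.glob("**/a", recursive=True))
        dirs = glob.glob("**/", recursive=True)
    finally:
        os.chdir(previous)

print(files)  # expected to contain both 'a' and the copy under 'd'
print(dirs)   # expected to contain only the 'd' directory, not ''
```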
|
|
msg254386 - (view) |
Author: Xavier de Gaye (xdegaye) *  |
Date: 2015-11-09 13:03 |
FWIW the patch looks good to me. I find the code in glob.py difficult to read, as it happily joins regular filenames together with os.path.join() or attempts to list the files contained in a regular file (sic). The attached diff makes the code more correct and easier to understand. It is meant to be applied on top of Serhiy's patch. |
|
|
msg254397 - (view) |
Author: Xavier de Gaye (xdegaye) *  |
Date: 2015-11-09 18:02 |
glob('invalid_dir/**', recursive=True) triggers the assert that was added by my patch in _rlistdir(). This new patch fixes this: when there is no magic character in the dirname part of a split(), and dirname is not an existing directory, then there is nothing to yield and the processing of pathname must stop (and thus in this case, no call is made to glob2() when basename is '**'). |
|
|
msg254412 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2015-11-09 21:19 |
New changeset 4532c4f37429 by Serhiy Storchaka in branch '3.5':
Issue #25584: Fixed recursive glob() with patterns starting with '**'.
https://hg.python.org/cpython/rev/4532c4f37429

New changeset 175cd763de57 by Serhiy Storchaka in branch 'default':
Issue #25584: Fixed recursive glob() with patterns starting with '**'.
https://hg.python.org/cpython/rev/175cd763de57

New changeset fefc10de2775 by Serhiy Storchaka in branch '3.5':
Issue #25584: Added "escape" to the __all__ list in the glob module.
https://hg.python.org/cpython/rev/fefc10de2775

New changeset 128e61cb3de2 by Serhiy Storchaka in branch 'default':
Issue #25584: Added "escape" to the __all__ list in the glob module.
https://hg.python.org/cpython/rev/128e61cb3de2 |
|
|
msg254414 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2015-11-09 21:58 |
Please open a new issue for the glob() optimization, Xavier. |
|
|
msg254441 - (view) |
Author: Xavier de Gaye (xdegaye) *  |
Date: 2015-11-10 11:21 |
New issue 25596 entered: regular files handled as directories in the glob module. Thanks for fixing this Serhiy. |
|
|