msg233727 - (view) |
Author: Al Sweigart (Al.Sweigart) * |
Date: 2015-01-09 08:15 |
GrepDialog.py's findfiles() method lacks a unit test. The comments in the unit test stub in test_grep.py correctly notes that since findfiles() method does not rely on the GrepDialog class, it can be made into a function. The attached patch does the following: - Moves findfiles() to be a function in the GrepDialog.py module. - Adds a unit test for findfiles() - Adds sensible default arguments - As suggested by the previous stub comments, findfiles() returns a generator instead of a full list - Changes 'list' and 'dir' names since those are built-ins - Uses os.walk() instead of reinventing it. There were so many changes to make that I eventually just rewrote the entire findfiles function. I've checked that the new version returns the same strings as the old version. |
|
|
msg233810 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2015-01-10 09:29 |
I have been putting off re-writing findfiles because it partly duplicates os.listdir, which should perhaps be used instead, except that a new, improved, os.scandir is in the works: PEP 471 and #22524. I believe filefiles currently searches depth first, whereas I would generally prefer breadth first. For instance, I would like all hits in /lib together and then hits in /lib/idlelib, /lib/test, /lib/tinter, etc. What do you think? |
|
|
msg233834 - (view) |
Author: Al Sweigart (Al.Sweigart) * |
Date: 2015-01-11 04:31 |
I checked with a couple grep programs and they use depth first. Which makes sense, since you'd want the return order to be something like: /a/spam.txt /a/aa/spam.txt /a/bb/spam.txt /x/spam.txt /y/spam.txt /z/spam.txt ...instead of the bread-first: /a/spam.txt /x/spam.txt /y/spam.txt /z/spam.txt /a/aa/spam.txt /a/bb/spam.txt However, it turns out this is moot since the first thing GrepDialog.py does with the returned results is sort them. |
|
|
msg336437 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2019-02-24 06:07 |
Findfiles was more like os.walk than os.listdir. os.walk now uses os.scandir instead of os.listdir. From 3.7.2 doc: "By default, errors from the scandir() call are ignored. If optional argument onerror is specified, it should be a function; it will be called with one argument, an OSError instance. It can report the error to continue with the walk, or raise the exception to abort the walk. Note that the filename is available as the filename attribute of the exception object." We should delete the try: except: and pass in a function to print the name and error message. This could be done after getting a working patch. Defaults are not needed as findfiles is called with 3 positional args. The calling method should do the replacement of '' with os.curdir(). The patch must be incomplete as generators do not have a sort method. Replace the first 3 lines of grep_it with folder, filepat = os.path.split(path) if not folder: folder = os.curdir() filelist = sorted(findfiles(folder, filepat, self.recvar.get()) and replace 'list' with 'filelist' in the following code. Cheryl, grab the issue if you want to do this. |
|
|
msg336482 - (view) |
Author: Cheryl Sabella (cheryl.sabella) *  |
Date: 2019-02-24 20:16 |
Sure, thanks. grep was on list to add tests to, so I'll take a look at this as well. Thanks! |
|
|
msg337339 - (view) |
Author: Cheryl Sabella (cheryl.sabella) *  |
Date: 2019-03-06 20:21 |
I've opened a PR with the changes. I did several commits, one for each stage. That is, the first one adds the test, then the second one moves `findfiles` to the module level, etc. I added the `onerror` function because if a directory that doesn't exist is entered as the path, it's nice to see that message at the top of the output window instead of just seeing zero hits. Plus, I had made a test for it. :-) One change that I didn't commit is that an alternative version of findfiles would be: def findfiles(folder, pattern, recursive): prefix = '**/' if recursive else '' yield from(pathlib.Path(folder).glob(f'{prefix}{pattern}')) The tests would have to be reworked, but manual testing showed it gave the same results, albeit without the `onerror`. One other comment about the sorting. If you change the `sorted()` in `grep_it()` to `list()` when you're looking at manual results, you'll see that `list()` shows all of one directory first, then all of the first child directory, etc (which makes sense since walk does each directory at a time). It's a quick way to compare the depth-first vs breadth first results. |
|
|
msg338653 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2019-03-23 07:46 |
Commit as-is. I will consider the Path.glob findfiles while working on #36323. Like walk, it yields in breadth first order. Unlike walk, it requires relative paths (which #37323 may need anyway) and needs code to deal with that. I like the breadth first listing from glob. Would it make any sense to give users a choice? What does unix grep do? |
|
|
msg338668 - (view) |
Author: Cheryl Sabella (cheryl.sabella) *  |
Date: 2019-03-23 11:33 |
New changeset d60f658fc0278f3fcdadec8ddcab35b8ae03e1d1 by Cheryl Sabella in branch 'master': bpo-23205: IDLE: Add tests and refactor grep's findfiles (GH-12203) https://github.com/python/cpython/commit/d60f658fc0278f3fcdadec8ddcab35b8ae03e1d1 |
|
|
msg338674 - (view) |
Author: miss-islington (miss-islington) |
Date: 2019-03-23 12:21 |
New changeset 5ab665005b7f8a21c133208f140389e3bb1a3294 by Miss Islington (bot) in branch '3.7': bpo-23205: IDLE: Add tests and refactor grep's findfiles (GH-12203) https://github.com/python/cpython/commit/5ab665005b7f8a21c133208f140389e3bb1a3294 |
|
|
msg338691 - (view) |
Author: Cheryl Sabella (cheryl.sabella) *  |
Date: 2019-03-23 17:51 |
On linux, grep does depth first, so searching for 'idle' from Lib.idlelib returns: --- cut --- help.py history.py idle.py all of idle_test/ __init__.py iomenu.py --- cut --- Although, within idle_test, the files aren't in alphabetical order. Also, as you can see __init__ is before iomenu, so underscores seem to be ignored. On SO, it looks like they suggest piping it to sort if one wants a given ordering. |
|
|