[Python-Dev] os.path.walk() lacks 'depth first' option (original) (raw)
Noah Spurrier noah@noah.org
Mon, 21 Apr 2003 15:15:09 -0700
- Previous message: [Python-Dev] Bug/feature/patch policy for optparse.py
- Next message: [Python-Dev] os.path.walk() lacks 'depth first' option
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
I like your version; although I used a different name to avoid confusion with os.path.walk.
Note that os.listdir does not include the special entries '.' and '..' even if they are present in the directory, so there is no need to remove them.
Tim Peters> for path in walk(root): Tim Peters>Or it could look like Tim Peters> for top, names in walk(root): Tim Peters>or Tim Peters> for top, dirnames, nondirnames in walk(root):
I like the idea of yielding (top, dirs, nondirs), but often I want to perform the same operations on both dirs and nondirs so separating them doesn't help that case. This seems to be a situation where there is no typical case, so my preference is for the simpler interface. It also eliminates the need to build two new lists from the list you get from os.listdir()... In fact, I prefer your first suggestion (for path in walk(root)), but that would require building a new list by prepending the basepath to each element of children because os.listdir does not return full path. So finally in this example, I just went with returning the basepath and the children (both files and directories).
Following Tom Good's example I added an option to ignore symbolic links. It would be better to detect cycles or at least prevent going higher up in the tree.
Tim Peters>obvious topdown argument, note a subtlety: when topdown is True, the caller Tim Peters>can prune the search by mutating the dirs list yielded to it. For example, This example still allows you to prune the search in Breadth first mode by removing elements from the children list. That is cool. for top, children in walk('C:/code/python', depthfirst=False): print top, children if 'CVS' in children: children.remove('CVS')
Yours, Noah
from future import generators # needed for Python 2.2
Inspired by Doug Fort from an ActiveState Python recipe.
His version didn't use recursion and didn't do depth first.
import os
def walktree (basepath=".", depthfirst=True, ignorelinks=True): """This walks a directory tree, starting from the basepath directory. This is somewhat like os.path.walk, but using generators instead of a visit function. One important difference is that walktree() defaults to DEPTH first with optional BREADTH first, whereas the os.path.walk function allows only BREADTH first. Depth first was made the default because it is safer if you are going to be modifying the directory names you visit. This avoids the problem of renaming a directory before visiting the children of that directory.
The ignorelinks option determines whether to follow symbolic links.
Some symbolic links can lead to recursive traversal cycles.
A better way would be to detect and prune cycles.
"""
children = os.listdir(basepath)
if not depthfirst:
yield basepath, children
for name in children:
fullpath = os.path.join(basepath, name)
if os.path.isdir (fullpath) and not (ignorelinks and os.path.islink(fullpath)):
for (next_basepath, next_children) in walktree (fullpath, depthfirst, ignorelinks):
yield next_basepath, next_children
if depthfirst:
yield basepath, children
def test(): for (basepath, children) in walktree(): for name in children: print os.path.join(basepath, name)
if name == 'main': test()
- Previous message: [Python-Dev] Bug/feature/patch policy for optparse.py
- Next message: [Python-Dev] os.path.walk() lacks 'depth first' option
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]