Issue 46227: add pathlib.Path.walk method
Pathlib is great, yet every time I have to parse a bunch of files, I have to use os.walk and join paths by hand. That's not a lot of code but I feel like pathlib should have higher-level abstractions for all path-related functionality of os. I propose we add a Path.walk method that could look like this:
def walk(self, topdown=True, onerror=None, followlinks=False):
    for root, dirs, files in self._accessor.walk(
        self,
        topdown=topdown,
        onerror=onerror,
        followlinks=followlinks
    ):
        root_path = Path(root)
        yield (
            root_path,
            [root_path._make_child_relpath(dir) for dir in dirs],
            [root_path._make_child_relpath(file) for file in files],
        )

Note: this version does not handle the case where top does not exist (like os.walk, which also doesn't handle it and just returns an empty generator).
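For anyone who wants to try the idea without patching pathlib, here is a minimal stand-alone sketch. It is not the code from the PR: `walk_paths` and `python_files` are hypothetical names, the function wraps the public os.walk instead of a pathlib accessor, and it joins paths with the `/` operator instead of the private helper.

```python
import os
from pathlib import Path


def walk_paths(top, topdown=True, onerror=None, followlinks=False):
    """Hypothetical stand-alone version of the proposed Path.walk.

    Yields (root, dirs, files) like os.walk, but every item is a Path.
    """
    for root, dirs, files in os.walk(
        top, topdown=topdown, onerror=onerror, followlinks=followlinks
    ):
        root_path = Path(root)
        yield (
            root_path,
            [root_path / name for name in dirs],
            [root_path / name for name in files],
        )


def python_files(top):
    """Example use: collect every *.py file below top, sorted."""
    return sorted(
        path
        for _root, _dirs, files in walk_paths(top)
        for path in files
        if path.suffix == ".py"
    )
```

Two things to keep in mind with this sketch. As in the note above, a missing top simply produces an empty generator, because os.walk swallows the error unless onerror is given. Also, the yielded dirs list is a new list, so removing entries from it does not prune the traversal the way editing the list yielded by os.walk does when topdown is true.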
Some people might suggest using Path.glob instead, but I found it less convenient for some use cases and generally slower (about 2.7 times slower in the timing below):

>>> timeit("list(Path('Lib').walk())", number=100, globals=globals())
1.9074640140170231
>>> timeit("list(Path('Lib').glob('**/*'))", number=100, globals=globals())
5.14890358998673
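To repeat this comparison on an interpreter that has no Path.walk, something like the following sketch could be used. It is an approximation only: os.walk stands in for the proposed method, the function names are made up for illustration, and the numbers will depend on the tree, the filesystem and the Python version.

```python
import os
from pathlib import Path
from timeit import timeit


def entries_via_walk(top):
    """Every file and directory below top, collected with os.walk."""
    found = []
    for root, dirs, files in os.walk(top):
        root_path = Path(root)
        found.extend(root_path / name for name in dirs + files)
    return found


def entries_via_glob(top):
    """The same entries, collected with a recursive glob."""
    return list(Path(top).glob("**/*"))


def compare(top, number=10):
    """Return (walk_seconds, glob_seconds) for `number` runs over top."""
    walk_time = timeit(lambda: entries_via_walk(top), number=number)
    glob_time = timeit(lambda: entries_via_glob(top), number=number)
    return walk_time, glob_time
```

On a plain tree both helpers should return the same set of paths, which makes the timing a like-for-like comparison. Trees that contain symlinks to directories may behave differently and are not covered by this sketch.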
The idea is interesting, and I agree that glob with a maxi wildcard is not a great solution. There is discussion on the PR about adding walk vs extending iterdir; could you post a message on discuss.python.org and sum up the discussion? (Pull requests on the CPython repo are only used to discuss implementation, not for debating ideas or proposing features.)