[Python-Dev] PEP 428: stat caching undesirable?

Charles-François Natali cf.natali at gmail.com
Thu May 2 08:56:33 CEST 2013


> Yes, definitely. This is exactly what my os.walk() replacement, "Betterwalk", does: https://github.com/benhoyt/betterwalk#readme
>
> On Windows you get all stat information from iterating the directory entries (FindFirstFile etc). And on Linux most of the time you get enough for os.walk() not to need an extra stat (though it does depend on the file system). I still hope to clean up Betterwalk and make a C version so we can use it in the standard library. In many cases it speeds up os.walk() by several times, even an order of magnitude in some cases. I intend for it to be a drop-in replacement for os.walk(), just faster.

Actually, there's Gregory's scandir() implementation (returning a generator to be able to cope with large directories) on its way:

http://bugs.python.org/issue11406

It's already been suggested to make it return a tuple (with d_type). I'm sure a review of the code (especially the Windows implementation) will be welcome.
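
To make the idea concrete, here is a minimal sketch of a walk that takes the file type from the directory listing instead of calling stat() on every entry. It assumes the os.scandir() API as it later landed in the standard library (Python 3.5+; the "with" form needs 3.6+), which is not necessarily what the patch on the issue looks like. walk_sketch is a hypothetical name used only for illustration.

```python
import os

def walk_sketch(top):
    """Yield (dirpath, dirnames, filenames) top-down, roughly like os.walk().

    Illustrative only: unlike os.walk(), symlinks to directories are
    listed under filenames here, and errors are not handled.
    """
    dirnames, filenames = [], []
    with os.scandir(top) as entries:
        for entry in entries:
            # is_dir() normally uses the type reported by the directory
            # listing itself (d_type on POSIX, FindFirstFile data on
            # Windows) and only falls back to a stat call when the file
            # system does not report a type.
            if entry.is_dir(follow_symlinks=False):
                dirnames.append(entry.name)
            else:
                filenames.append(entry.name)
    yield top, dirnames, filenames
    for name in dirnames:
        yield from walk_sketch(os.path.join(top, name))
```

Usage is the same as os.walk(): "for dirpath, dirnames, filenames in walk_sketch('.'): ...". Whether the extra stat is actually avoided still depends on the file system, as noted above.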


