[Python-Dev] Updates to PEP 471, the os.scandir() proposal

Paul Moore p.f.moore at gmail.com
Wed Jul 9 15:12:34 CEST 2014


On 9 July 2014 13:48, Ben Hoyt <benhoyt at gmail.com> wrote:

> Okay folks -- please respond: option #1 as per the current PEP 471, or option #2 with Ethan's multi-level thing tweaks as per the above?

I'm probably about 50/50 at the moment. What will swing it for me is likely error handling, so let's try both approaches with some error handling:

Rules are that we calculate the total size of all files in a tree (as returned from lstat), with files that fail to stat being logged and their size assumed to be 0.

Option 1:

    def get_tree_size(path):
        total = 0
        for entry in os.scandir(path):
            try:
                isdir = entry.is_dir()
            except OSError:
                logger.warn("Cannot stat {}".format(entry.full_name))
                continue
            if isdir:
                total += get_tree_size(entry.full_name)
            else:
                try:
                    total += entry.lstat().st_size
                except OSError:
                    logger.warn("Cannot stat {}".format(entry.full_name))
        return total
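
For anyone who wants to run Option 1 as written, here is a sketch against os.scandir() as it eventually shipped in Python 3.5, not the draft API under discussion. The differences are assumptions layered on top of the original: the shipped attribute is entry.path rather than full_name, lstat() is spelled entry.stat(follow_symlinks=False), and the logger name is arbitrary.

```python
import logging
import os

logger = logging.getLogger("tree_size")  # arbitrary name, for illustration only


def get_tree_size(path):
    """Sum the lstat sizes of all files under path; entries that fail to stat count as 0."""
    total = 0
    for entry in os.scandir(path):
        try:
            # follow_symlinks=False keeps lstat semantics: a symlink to a
            # directory is treated as a file and is not recursed into.
            isdir = entry.is_dir(follow_symlinks=False)
        except OSError:
            logger.warning("Cannot stat %s", entry.path)
            continue
        if isdir:
            total += get_tree_size(entry.path)
        else:
            try:
                total += entry.stat(follow_symlinks=False).st_size
            except OSError:
                logger.warning("Cannot stat %s", entry.path)
    return total
```

Note that the two try blocks are exactly the boilerplate the comparison is about: every call that might touch the filesystem needs its own handler.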

Option 2:

    def log_err(exc):
        logger.warn("Cannot stat {}".format(exc.filename))

    def get_tree_size(path):
        total = 0
        for entry in os.scandir(path, info='lstat', onerror=log_err):
            if entry.is_dir:
                total += get_tree_size(entry.full_name)
            else:
                total += entry.lstat.st_size
        return total

On this basis, #2 wins. However, I'm slightly uncomfortable using the filename attribute of the exception in the logging, as there is nothing in the docs saying that this will give a full pathname. I'd hate to see "Unable to stat __init__.py"!!!

So maybe the onerror function should also receive the DirEntry object
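
To make that suggestion concrete, here is a rough sketch of an error callback that receives the entry as well as the exception, so the log message never depends on what exc.filename happens to contain. This is not something the PEP specifies: scandir_lstat and the (exc, entry) callback signature are made up for illustration, and the helper is built on os.scandir() as shipped in Python 3.5 (entry.path, entry.stat(follow_symlinks=False)).

```python
import logging
import os
import stat

logger = logging.getLogger("tree_size")  # arbitrary name, for illustration only


def scandir_lstat(path, onerror=None):
    """Hypothetical helper: yield (entry, lstat_result) pairs for path.

    Entries whose lstat fails are skipped after calling onerror(exc, entry).
    """
    for entry in os.scandir(path):
        try:
            st = entry.stat(follow_symlinks=False)
        except OSError as exc:
            if onerror is not None:
                onerror(exc, entry)
            continue
        yield entry, st


def log_err(exc, entry):
    # entry.path is always the joined path, whatever exc.filename holds
    logger.warning("Cannot stat %s", entry.path)


def get_tree_size(path, onerror=log_err):
    total = 0
    for entry, st in scandir_lstat(path, onerror):
        if stat.S_ISDIR(st.st_mode):
            total += get_tree_size(entry.path, onerror)
        else:
            total += st.st_size
    return total
```

The calling code stays as short as Option 2, and the callback has the full path available without relying on undocumented behaviour of the exception.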

OK, looks like option #2 is now my preferred option. My gut instinct still rebels over an API that deliberately throws information away in the default case, even though there is now an option to ask it to keep that information, but I see the logic and can learn to live with it.

Paul


