Add fast path to os.[l]stat() that returns incomplete information · Issue #99726 · python/cpython

A future update to Windows is bringing a new filesystem API for getting stat(-like) information more efficiently from a filename. Currently, we have to open the file, which is quite a slow operation. Being able to simply request metadata based on the path is a real improvement. My testing shows os.stat() and os.lstat() (in the case where no traversal is needed) taking less than 1/4 of their current time when using the new API. I'll link the change in a PR below.

However, the new API does not include the volume serial number (VSN), which is how we fill in the st_dev field. Adding an extra call to fetch the VSN brings the total time back up to what we take today, so the performance benefit disappears.[1]

So I'd like to propose adding a fast=False argument to os.stat and os.lstat. When left as False, you get the current behaviour. If you pass True, we only guarantee a smaller set of data, and warn that other fields may be absent on some platforms.
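To make the proposed calling convention concrete, here is a minimal sketch that emulates it in pure Python. The names `stat_with_fast_flag` and `device_or_none` are hypothetical, and os.stat() itself does not accept a `fast` argument here. The wrapper simply drops st_dev when `fast=True` to imitate a field that the fast path might not supply. Which fields would really be absent is an assumption made for illustration only.

```python
import os
from types import SimpleNamespace


def stat_with_fast_flag(path, *, fast=False):
    """Hypothetical stand-in for the proposed os.stat(path, fast=...).

    This is an emulation for illustration only; it always performs a
    full os.stat() call and gains no speed.
    """
    full = os.stat(path)
    if not fast:
        # Default: current behaviour, every field is populated.
        return full
    # Emulated fast path: copy every st_* field except st_dev, to imitate
    # a result where the volume serial number was never fetched.
    fields = {
        name: getattr(full, name)
        for name in dir(full)
        if name.startswith("st_") and name != "st_dev"
    }
    return SimpleNamespace(**fields)


def device_or_none(st):
    """Read st_dev defensively, since a fast result may not carry it."""
    return getattr(st, "st_dev", None)
```

Code that opts into the fast path would need to read non-guaranteed fields defensively, as `device_or_none` does, rather than assume they are always present.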

Looking through the fields, I propose guaranteeing only the file type bits of st_mode (not the permission bits), st_size, and st_mtime[_ns].[2] All the other fields can stay as they are for now, but we then have the option to drop them from the fast path in the future.[3] It's no accident that these are the fields behind the other os.path functions we already offer (apart from samestat, which will have to stay on the slow path and probably needs an even slower check in order to be reliable cross-platform...)
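As a rough illustration of why that small set of fields is enough for the common os.path helpers, the sketch below derives the same answers as os.path.isdir, os.path.isfile, os.path.getsize and os.path.getmtime from a single os.stat() result. The function name `summarize` is hypothetical, and it uses today's regular os.stat(), not the proposed fast path.

```python
import os
import stat


def summarize(path):
    """Derive os.path-style answers from one stat result.

    Only the file type bits of st_mode, st_size and st_mtime[_ns] are
    consulted, which are the fields the proposal would guarantee.
    """
    st = os.stat(path)
    return {
        "file_type": stat.S_IFMT(st.st_mode),   # type bits only, no permissions
        "is_dir": stat.S_ISDIR(st.st_mode),     # compare os.path.isdir
        "is_file": stat.S_ISREG(st.st_mode),    # compare os.path.isfile
        "size": st.st_size,                     # compare os.path.getsize
        "mtime": st.st_mtime,                   # compare os.path.getmtime
        "mtime_ns": st.st_mtime_ns,
    }
```

A caller that needs several of these answers for the same path can make one stat call instead of one call per os.path helper.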

I'm not sure who cares most about this, so I'm going to leave this open for a while.

Linked PRs

Footnotes

  1. There is still discussion about changing this API before it releases. If that happens, the rest of this proposal is moot, unless we like the idea anyway.
  2. On Windows, we can further guarantee st_file_attributes and st_reparse_tag, as these are the raw values used to calculate the file type bits of st_mode.
  3. stat is already very fast on POSIX-ish filesystems, so it's unlikely to be an issue there, but if we wanted to specialise for network FS or similar then we'd be able to.