[Python-Dev] Updates to PEP 471, the os.scandir() proposal

Victor Stinner victor.stinner at gmail.com
Thu Jul 10 02:57:00 CEST 2014


Oh, since I'm proposing to add a new stat() method to DirEntry, we can optimize it. stat() can reuse the lstat() result if the file is not a symlink. That simplifies is_dir().
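To illustrate the reuse rule, here is a minimal sketch. The helper name stat_from_lstat() is hypothetical and is not part of the proposal; it only shows that for a non-symlink one os.lstat() call can serve as both results.

```python
import os
import stat
import tempfile


def stat_from_lstat(path):
    """Return (lstat_result, stat_result), avoiding a second call when possible.

    Hypothetical helper for illustration; not part of PEP 471.
    """
    lst = os.lstat(path)
    if stat.S_ISLNK(lst.st_mode):
        # Symlink: the target has to be examined with a separate os.stat().
        return lst, os.stat(path)
    # Not a symlink: lstat() and stat() describe the same file.
    return lst, lst


with tempfile.TemporaryDirectory() as tmp:
    regular = os.path.join(tmp, "file.txt")
    open(regular, "w").close()

    lst, st = stat_from_lstat(regular)
    reused = st is lst  # the very same object is reused
    same_as_os_stat = st.st_ino == os.stat(regular).st_ino
```

For a regular file the second element is the same object as the first, so no extra os.stat() call is made.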

New pseudo-code:

import os
import stat

# DT_UNKNOWN, DT_DIR and DT_LNK are the d_type constants from <dirent.h>

class DirEntry:
    def __init__(self, path, name, lstat=None, d_type=None):
        self.name = name
        self.full_name = os.path.join(path, name)
        # lstat is known on Windows
        self._lstat = lstat
        if lstat is not None and not stat.S_ISLNK(lstat.st_mode):
            # On Windows, stat() only calls os.stat() for symlinks
            self._stat = lstat
        else:
            self._stat = None
        # d_type is known on UNIX
        if d_type != DT_UNKNOWN:
            self._d_type = d_type
        else:
            # DT_UNKNOWN is not very useful information :-p
            self._d_type = None

    def stat(self):
        if self._stat is None:
            self._stat = os.stat(self.full_name)
        return self._stat

    def lstat(self):
        if self._lstat is None:
            self._lstat = os.lstat(self.full_name)
            if self._stat is None and not stat.S_ISLNK(self._lstat.st_mode):
                # not a symlink: stat() can reuse the lstat() result
                self._stat = self._lstat
        return self._lstat

    def is_dir(self):
        if self._d_type is not None:
            if self._d_type == DT_DIR:
                return True
            if self._d_type != DT_LNK:
                return False
        else:
            # local names must not shadow the stat module
            lst = self.lstat()
            if stat.S_ISDIR(lst.st_mode):
                return True
        # if lstat() was already called, stat() reuses its result
        # unless the entry is a symlink
        st = self.stat()
        return stat.S_ISDIR(st.st_mode)

The extra caching rules are complex, which is why I suggest not documenting them.

In short: when d_type or the lstat() result is already known, is_dir() only needs an extra syscall for symlinks; for other file types it does not need any syscall.
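The syscall claim can be checked with a small counting sketch. It assumes a POSIX-like system where os.symlink() works without special privileges. The CountingEntry class and its syscalls counter are hypothetical names used for illustration only; the counter counts calls to os.lstat() and os.stat(). The DT_* values are the Linux <dirent.h> ones, hard-coded here because Python does not expose them.

```python
import os
import stat
import tempfile

DT_DIR, DT_LNK = 4, 10  # Linux <dirent.h> values, for illustration only


class CountingEntry:
    """Hypothetical stand-in for DirEntry that counts os.stat()/os.lstat() calls."""

    def __init__(self, path, d_type=None):
        self.path = path
        self._d_type = d_type
        self._lstat = None
        self._stat = None
        self.syscalls = 0

    def lstat(self):
        if self._lstat is None:
            self.syscalls += 1
            self._lstat = os.lstat(self.path)
            if self._stat is None and not stat.S_ISLNK(self._lstat.st_mode):
                self._stat = self._lstat  # reuse for non-symlinks
        return self._lstat

    def stat(self):
        if self._stat is None:
            self.syscalls += 1
            self._stat = os.stat(self.path)
        return self._stat

    def is_dir(self):
        if self._d_type is not None:
            if self._d_type == DT_DIR:
                return True
            if self._d_type != DT_LNK:
                return False
        elif stat.S_ISDIR(self.lstat().st_mode):
            return True
        return stat.S_ISDIR(self.stat().st_mode)


with tempfile.TemporaryDirectory() as tmp:
    subdir = os.path.join(tmp, "sub")
    os.mkdir(subdir)
    link = os.path.join(tmp, "link")
    os.symlink(subdir, link)

    plain = CountingEntry(subdir, d_type=DT_DIR)   # d_type known, real directory
    plain_result = (plain.is_dir(), plain.syscalls)

    linked = CountingEntry(link, d_type=DT_LNK)    # d_type known, symlink
    linked_result = (linked.is_dir(), linked.syscalls)

    unknown = CountingEntry(link)                  # d_type unknown, symlink
    unknown_result = (unknown.is_dir(), unknown.syscalls)
```

With a known d_type, the directory costs no call and the symlink costs one os.stat(). When d_type is unknown, the symlink costs an os.lstat() plus an os.stat().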

Victor


