[Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info

Gregory P. Smith greg at krypto.org
Tue May 14 08:39:21 CEST 2013


On Sun, May 12, 2013 at 3:04 PM, Ben Hoyt <benhoyt at gmail.com> wrote:

> And if we're creating a custom object instead, why return a 2-tuple
> rather than making the entry's name an attribute of the custom object?
>
> To me, that suggests a more reasonable API for os.scandir() might be
> for it to be an iterator over "direntry" objects:
>
>     name (as a string)
>     is_file()
>     is_dir()
>     is_link()
>     stat()
>     cached_stat (None or a stat object)

Nice! I really like your basic idea of returning a custom object instead of a 2-tuple. And I agree with Christian that .stat() would be clearer called .lstat(). I also like your later idea of simply exposing .dirent (would be None on Windows).

One tweak I'd suggest is that is_file() etc be called isfile() etc without the underscore, to match the naming of the os.path.is* functions.

> That would actually make sense at an implementation
> level anyway - is_file() etc would check self.cached_lstat first, and
> if that was None they would check self.dirent, and if that was also
> None they would raise an error.

Hmm, I'm not sure about this at all. Are you suggesting that the DirEntry object's is* functions would raise an error if both cached_lstat and dirent were None? Wouldn't it make for a much simpler API to just call os.lstat() and populate cached_lstat instead? As far as I'm concerned, that'd be the point of making DirEntry.lstat() a function.

In fact, I don't think .cached_lstat should be exposed to the user. They just call entry.lstat(), and it returns a cached stat or calls os.lstat() to get the real stat if required (and populates the internal cached stat value). And the entry.is* functions would call entry.lstat() if dirent was None or d_type was DT_UNKNOWN.

This would change relatively nasty code like this:

    files = []
    dirs = []
    for entry in os.scandir(path):
        try:
            isdir = entry.isdir()
        except NotPresentError:
            st = os.lstat(os.path.join(path, entry.name))
            isdir = stat.S_ISDIR(st.st_mode)
        if isdir:
            dirs.append(entry.name)
        else:
            files.append(entry.name)

Into nice clean code like this:

    files = []
    dirs = []
    for entry in os.scandir(path):
        if entry.isdir():
            dirs.append(entry.name)
        else:
            files.append(entry.name)

This change would make scandir() usable by ordinary mortals, rather than just hardcore library implementors.
In other words, I'm proposing that the DirEntry objects yielded by scandir() would have .name and .dirent attributes, and .isdir(), .isfile(), .islink(), .lstat() methods, and look basically like this (though presumably implemented in C):

    import os
    import stat

    # DT_UNKNOWN, DT_DIR, DT_REG and DT_LNK are the d_type constants
    # from <dirent.h>; they are assumed to be defined elsewhere.

    class DirEntry:
        def __init__(self, name, dirent, lstat, path='.'):
            # User shouldn't need to call this, but called internally by scandir()
            self.name = name
            self.dirent = dirent
            self._lstat = lstat  # non-public attributes
            self._path = path

        def lstat(self):
            # Return the cached stat, calling os.lstat() only on first use
            if self._lstat is None:
                self._lstat = os.lstat(os.path.join(self._path, self.name))
            return self._lstat

        def isdir(self):
            if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
                return self.dirent.d_type == DT_DIR
            else:
                return stat.S_ISDIR(self.lstat().st_mode)

        def isfile(self):
            if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
                return self.dirent.d_type == DT_REG
            else:
                return stat.S_ISREG(self.lstat().st_mode)

        def islink(self):
            if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
                return self.dirent.d_type == DT_LNK
            else:
                return stat.S_ISLNK(self.lstat().st_mode)

Oh, and the .dirent would either be None (Windows) or would have .d_type and .d_ino attributes (Linux, OS X).

This would make the scandir() API nice and simple to use for callers, but still expose all the information the OS provides (both the meaningful fields in dirent, and a full stat on Windows, nicely cached in the DirEntry object).

Thoughts?
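For anyone who wants to try the proposed semantics without a C implementation, here is a rough pure-Python sketch. It is only an illustration under stated assumptions: scandir_sketch() and split_entries() are hypothetical names that do not appear in the proposal, the listing comes from os.listdir(), and dirent is always None (the Windows-style case), so every is* call falls back to the lazily cached lstat().

```python
import os
import stat


class DirEntry:
    """Sketch of the proposed entry object; dirent is always None here."""

    def __init__(self, name, dirent=None, lstat=None, path='.'):
        self.name = name
        self.dirent = dirent   # None, as proposed for Windows
        self._lstat = lstat    # cached os.lstat() result, filled lazily
        self._path = path

    def lstat(self):
        # Call os.lstat() at most once per entry, then reuse the result.
        if self._lstat is None:
            self._lstat = os.lstat(os.path.join(self._path, self.name))
        return self._lstat

    def isdir(self):
        return stat.S_ISDIR(self.lstat().st_mode)

    def isfile(self):
        return stat.S_ISREG(self.lstat().st_mode)

    def islink(self):
        return stat.S_ISLNK(self.lstat().st_mode)


def scandir_sketch(path='.'):
    """Hypothetical stand-in for the proposed scandir(), built on os.listdir()."""
    for name in os.listdir(path):
        yield DirEntry(name, path=path)


def split_entries(path):
    """Return (dirs, files) as sorted lists of names, using the simple loop style."""
    dirs, files = [], []
    for entry in scandir_sketch(path):
        if entry.isdir():
            dirs.append(entry.name)
        else:
            files.append(entry.name)
    return sorted(dirs), sorted(files)
```

Because this sketch has no dirent data, it performs one lstat() per entry and so shows only the caller-facing API and the caching behaviour, not the speedup a real scandir() would get from d_type or from the Windows directory listing.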

I like the sound of this (which sounds like what you've implemented now though I haven't looked at your code).

-gps

-Ben

