[Python-Dev] My summary of the scandir (PEP 471)

Victor Stinner victor.stinner at gmail.com
Tue Jul 1 16:28:10 CEST 2014


2014-07-01 15:00 GMT+02:00 Ben Hoyt <benhoyt at gmail.com>:

(a) it doesn't call stat for you (on POSIX), so you have to check an attribute and call scandir manually if you need it,

Yes, and that's something common when you use the os module. For example, don't try to call os.fork(), os.getgid() or os.fchmod() on Windows :-) Closer to your PEP, the following OS attributes are only available on UNIX: st_blocks, st_blksize, st_rdev, st_flags; and st_file_attributes is only available on Windows.

I don't think that using lstat_result is a common need when browsing a directory. In most cases, you only need is_dir() and the name attribute.
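As a rough illustration of that common case, here is a minimal sketch that uses only the name attribute and is_dir(). It assumes os.scandir() as it later shipped in the standard library (with context-manager support); the helper name list_subdirs is hypothetical and not part of the PEP.

```python
import os
import tempfile


def list_subdirs(path):
    """Return the sorted names of the direct subdirectories of path."""
    # Only entry.name and entry.is_dir() are used; no explicit lstat() call.
    with os.scandir(path) as entries:
        return sorted(entry.name for entry in entries if entry.is_dir())


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "pkg"))
    open(os.path.join(root, "README.txt"), "w").close()
    print(list_subdirs(root))
```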

1) the original proposal in the current version of PEP 471, where DirEntry has an .lstat() method which calls stat() on POSIX but is free on Windows

On UNIX, does it mean that .lstat() calls os.lstat() at the first call, and then always returns the same result? It would be different from os.lstat() and pathlib.Path.stat() :-( I would prefer to have the same behaviour as pathlib and os (you know, the well-known consistency of Python stdlib). As I wrote, I expect a function call to always retrieve the new status.
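To make the expectation concrete, the sketch below shows what "always retrieve the new status" means for os.lstat(): each call queries the filesystem again, so a change made between two calls is visible. The file name data.bin is only an example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "data.bin")

    with open(path, "wb") as f:
        f.write(b"abc")
    size_before = os.lstat(path).st_size

    # Modify the file, then ask again: os.lstat() does not cache anything.
    with open(path, "ab") as f:
        f.write(b"defg")
    size_after = os.lstat(path).st_size

print(size_before, size_after)
```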

2) Nick Coghlan's proposal on the previous thread (https://mail.python.org/pipermail/python-dev/2014-June/135261.html) suggesting an ensure_lstat keyword param to scandir if you need the lstat_result value

I don't like this idea because it makes error handling more complex. The syntax to catch exceptions on an iterator is verbose (while: try: next() except ...).

Whereas calling os.lstat(entry.fullname()) is explicit and it's easy to surround it with try/except.
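The difference can be sketched as follows. This is only an illustration of the two shapes, not code from the PEP: both function names are hypothetical, and entry.path from the os.scandir() that later shipped stands in for the entry.fullname() spelling used in this thread.

```python
import os
import tempfile


def names_verbose(path):
    """Errors raised by the iterator itself need the while/try/next() form."""
    names = []
    with os.scandir(path) as it:
        while True:
            try:
                entry = next(it)
            except StopIteration:
                break
            except OSError:
                # Raised while fetching the next entry; give up on the listing.
                break
            names.append(entry.name)
    return sorted(names)


def sizes_explicit(path):
    """An explicit os.lstat() call is easy to wrap in try/except."""
    sizes = {}
    with os.scandir(path) as it:
        for entry in it:
            try:
                st = os.lstat(entry.path)
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            sizes[entry.name] = st.st_size
    return sizes


with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "a.txt"), "wb") as f:
        f.write(b"hello")
    print(names_verbose(root), sizes_explicit(root))
```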

.lstat_result being None sometimes (on POSIX),

Don't do that, it's not how Python handles portability. We use hasattr().
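A minimal sketch of the hasattr() convention, applied to the platform-specific stat fields mentioned earlier in this message. The helper optional_stat_fields and the OPTIONAL_FIELDS tuple are hypothetical names chosen for the example.

```python
import os

# Fields that exist only on some platforms (see the list earlier in this mail).
OPTIONAL_FIELDS = (
    "st_blocks", "st_blksize", "st_rdev", "st_flags", "st_file_attributes",
)


def optional_stat_fields(st):
    """Return only the platform-specific fields this stat result provides."""
    # A missing attribute is simply absent; it is never present-but-None.
    return {name: getattr(st, name) for name in OPTIONAL_FIELDS if hasattr(st, name)}


print(sorted(optional_stat_fields(os.stat(os.curdir))))
```

The printed list depends on the platform, which is the point: callers test for the attribute instead of comparing a value with None.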

would it ever really happen that readdir() would succeed but an os.stat() immediately after would fail?

Yes, it can happen. The filesystem is system-wide and shared by all users. The file can be deleted.
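The race can be reproduced deterministically by deleting the file between the listing and the stat call. In the sketch below the deletion happens in the same process, standing in for another user or process; the file name victim.txt is only an example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    open(os.path.join(root, "victim.txt"), "w").close()

    # Step 1: the directory listing succeeds and reports the file.
    paths = [entry.path for entry in os.scandir(root)]

    # Step 2: "someone else" deletes the file (simulated in-process here).
    os.remove(paths[0])

    # Step 3: the later lstat() fails even though the listing succeeded.
    try:
        os.lstat(paths[0])
        outcome = "still there"
    except FileNotFoundError:
        outcome = "vanished"

print(outcome)
```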

Really, are bytes filenames deprecated?

Yes, in all functions of the os module since Python 3.3. I'm sure because I implemented the deprecation :-)

Try open(b'test.txt', 'w') on Windows with python -Werror.

I think maybe they should be, as they don't work on Windows :-)

Windows has an API dedicated to bytes filenames, the ANSI API. But this API has annoying bugs: it replaces unencodable characters with question marks, and there is no option to be notified of the encoding error.

Different users complained about that. It was decided to not change Python since Python is a light wrapper over the kernel system calls. But bytes filenames are now deprecated to advise users to use the native type for filenames on Windows: Unicode!

but the latest Python "os" docs (https://docs.python.org/3.5/library/os.html) still say that all functions that accept path names accept either str or bytes,

Maybe I forgot to update the documentation :-(

So I think scandir() should do the same thing.

You may support scandir(bytes) on Windows, but you will need to emit a deprecation warning too (such warnings are silent by default).
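One possible shape for such a warning is sketched below. This is an assumption about how it could look, not the actual os module code: check_path_type and its on_windows parameter are hypothetical, and the parameter exists only so the example behaves the same on every platform.

```python
import os
import warnings


def check_path_type(path, on_windows=(os.name == "nt")):
    """Hypothetical helper: warn when a bytes path is used on Windows."""
    if on_windows and isinstance(path, bytes):
        warnings.warn(
            "bytes filenames are deprecated on Windows, use str instead",
            DeprecationWarning,
            stacklevel=2,
        )
    return path


# Record warnings explicitly, since they may not be displayed by default.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_path_type(b"some_dir", on_windows=True)
    check_path_type("some_dir", on_windows=True)

print([w.category.__name__ for w in caught])
```

Only the bytes call is recorded. With the "error" filter, which is what python -Werror installs, the same call would raise DeprecationWarning instead.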

Victor


