[Python-Dev] PEP 471: scandir(fd) and pathlib.Path(name, dir_fd=None)

Akira Li 4kir4.1i at gmail.com
Tue Jul 1 17:58:03 CEST 2014


Ben Hoyt <benhoyt at gmail.com> writes:

Thanks, Victor.

I don't have any experience with dirfd handling, so unfortunately can't really comment here.

What advantages does it bring? I notice that even os.listdir() on Python 3.4 doesn't have anything related to file descriptors, so I'd be in favour of not including support. We can always add it later.

-Ben

FYI, os.listdir does support file descriptors in Python 3.3+. Try:

    import os
    os.listdir(os.open('.', os.O_RDONLY))

NOTE: os.supports_fd and os.supports_dir_fd are different sets.
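To make the distinction concrete, here is a minimal sketch, assuming a platform where os.listdir is a member of os.supports_fd (typically POSIX). The helper name list_via_fd is made up for illustration. Unlike the one-liner above, it closes the descriptor itself, because os.listdir(fd) leaves the caller's descriptor open.

```python
import os

def list_via_fd(path="."):
    """List a directory through an open descriptor (hypothetical helper)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.listdir(fd)
    finally:
        # os.listdir(fd) does not close the caller's descriptor.
        os.close(fd)

# os.supports_fd:     functions that accept an fd in place of a path.
# os.supports_dir_fd: functions that accept a dir_fd= keyword argument.
fd_ok = os.listdir in os.supports_fd
dir_fd_ok = os.listdir in os.supports_dir_fd  # listdir has no dir_fd parameter

if fd_ok:
    print(sorted(list_via_fd(".")))
print("listdir:", fd_ok, dir_fd_ok)
print("stat:   ", os.stat in os.supports_fd, os.stat in os.supports_dir_fd)
```

Checking membership before calling is the documented way to find out what the current platform supports; the results differ between systems.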

See also, https://mail.python.org/pipermail/python-dev/2014-June/135265.html

-- Akira

P.S. Please, don't put your answer on top of the message you are replying to.

On Tue, Jul 1, 2014 at 3:44 AM, Victor Stinner <victor.stinner at gmail.com> wrote:

Hi,

IMO we must decide whether or not scandir() should support file descriptors. It's an important decision with a significant impact on the API.

To support scandir(fd), the minimum is to store dir_fd in DirEntry: dir_fd would be None for scandir(str).

scandir(fd) must not close the file descriptor, it should be done by the caller. Handling the lifetime of the file descriptor is a difficult problem, it's better to let the user decide how to handle it.

There is the problem of the limit of open file descriptors, usually 1024 but it can be lower. It can be an issue for very deep file hierarchy.

If we choose to support scandir(fd), it's probably safer to not use scandir(fd) by default in os.walk() (use scandir(str) instead), wait until the feature is well tested, corner cases are well known, etc.

The second step is to enhance pathlib.Path to support an optional file descriptor. Path already has methods on filenames like chmod(), exists(), rename(), etc.

Example:

    fd = os.open(path, os.O_DIRECTORY)
    try:
        for entry in os.scandir(fd):
            # ... use entry to benefit of entry cache: is_dir(), lstat_result ...
            path = pathlib.Path(entry.name, dir_fd=entry.dir_fd)
            # ... use path which uses dir_fd ...
    finally:
        os.close(fd)

Problem: if the path object is stored somewhere and used after the loop, Path methods will fail because dir_fd was closed. It's even worse if a new directory uses the same file descriptor :-/ (security issue, or at least tricky bugs!)

Victor
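The lifetime problem described above can be reproduced today with the dir_fd= parameter that os.stat() has had since Python 3.3. The following is only a sketch: the (name, dir_fd) tuple stands in for the proposed pathlib.Path(name, dir_fd=...), which is hypothetical, and the function name stale_dir_fd_demo is invented for illustration.

```python
import os
import tempfile

def stale_dir_fd_demo():
    """Show a stored (name, dir_fd) pair going stale once the fd is closed."""
    if os.stat not in os.supports_dir_fd or os.listdir not in os.supports_fd:
        return None  # platform lacks the needed fd support

    with tempfile.TemporaryDirectory() as top:
        open(os.path.join(top, "a.txt"), "w").close()

        fd = os.open(top, os.O_RDONLY)
        try:
            # Stand-in for Path(entry.name, dir_fd=entry.dir_fd).
            saved = [(name, fd) for name in os.listdir(fd)]
            name, dir_fd = saved[0]
            # Works while the directory descriptor is still open.
            size_inside = os.stat(name, dir_fd=dir_fd).st_size
        finally:
            os.close(fd)

        # The saved pair now refers to a closed descriptor number.
        try:
            os.stat(name, dir_fd=dir_fd)
            outcome = "no error (descriptor number was reused)"
        except OSError as exc:
            outcome = "failed: " + str(exc.strerror)
        return size_inside, outcome

result = stale_dir_fd_demo()
print(result)
```

If another directory were opened between os.close(fd) and the second os.stat(), the operating system could hand out the same descriptor number, and the stale pair would silently resolve against the wrong directory. That is the "even worse" case mentioned above.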




