[Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors (original) (raw)

Charles-François Natali cf.natali at gmail.com
Sun Jan 13 11:40:34 CET 2013


Hello,

PEP: 433 Title: Add cloexec argument to functions creating file descriptors

I'm not a native English speaker, but it seems to me that the correct wording should be "parameter" (part of the function definition/prototype, whereas "argument" refers to the actual value supplied).

This PEP proposes to add a new optional argument cloexec on functions creating file descriptors in the Python standard library. If the argument is True, the close-on-exec flag will be set on the new file descriptor.

It would probably be useful to recap briefly what the close-on-exec flag does.

Also, ISTM that Windows also supports this flag. If it does, then "cloexec" might not be the best name, because it refers to the execve() Unix system call. Maybe something like "noinherit" would be clearer (although coming from a Unix background "cloexec" is crystal-clear to me :-).

On UNIX, subprocess closes file descriptors greater than 2 by default since Python 3.2 [#subprocessclose]. All file descriptors created by the parent process are automatically closed.

("in the child process")

xmlrpc.server.SimpleXMLRPCServer sets the close-on-exec flag of the listening socket, the parent class socketserver.BaseServer does not set this flag.

As has been discussed earlier, the real issue is that the server socket is not closed in the child process. Setting it cloexec would only add an extra security for multi-threaded programs.

Inherited file descriptors issues ---------------------------------

Closing the file descriptor in the parent process does not close the related resource (file, socket, ...) because it is still open in the child process.

You might want to go through the bug tracker to find examples of such issues, and list them: http://bugs.python.org/issue7213 http://bugs.python.org/issue12786 http://bugs.python.org/issue2320 http://bugs.python.org/issue3006

The list goes on. Some of those examples resulted in deadlocks.

The listening socket of TCPServer is not closed on exec(): the child process is able to get connection from new clients; if the parent closes the listening socket and create a new listening socket on the same address, it would get an "address already is used" error.

See above for the real cause.

Not closing file descriptors can lead to resource exhaustion: even if the parent closes all files, creating a new file descriptor may fail with "too many files" because files are still open in the child process.

You might want to detail the course of events (a child if forked before the parent gets a chance to close the file descriptors... EMFILE).

Leaking file descriptors is a major security vulnerability. An untrusted child process can read sensitive data like passwords and take control of the parent process though leaked file descriptors. It is for example a known vulnerability to escape from a chroot.

You might add a link to this: https://www.securecoding.cert.org/confluence/display/seccode/FIO42-C.+Ensure+files+are+properly+closed+when+they+are+no+longer+needed

It can also result in DoS (if the child process highjacks the server socket and accepts connections).

Example of vulnerabilities: http://www.openssh.com/txt/portable-keysign-rand-helper.adv http://www.securityfocus.com/archive/1/348368 http://cwe.mitre.org/data/definitions/403.html

The problem is that these flags and functions are not portable: only recent versions of operating systems support them. OCLOEXEC and SOCKCLOEXEC flags are ignored by old Linux versions and so FDCLOEXEC flag must be checked using fcntl(fd, FGETFD). If the kernel ignores OCLOEXEC or SOCKCLOEXEC flag, a call to fcntl(fd, FSETFD, flags) is required to set close-on-exec flag.

.. note:: OpenBSD older 5.2 does not close the file descriptor with close-on-exec flag set if fork() is used before exec(), but it works correctly if exec() is called without fork().

That would be really surprising, are your sure your test case is correct? Otherwise it could be a compilation issue, because I simply can't believe OpenBSD would ignore the close-on-exec flag.

This PEP only change the close-on-exec flag of file descriptors created by the Python standard library, or by modules using the standard library. Third party modules not using the standard library should be modified to conform to this PEP. The new os.setcloexec() function can be used for example.

Impacted functions: * os.forkpty() * http.server.CGIHTTPRequestHandler.runcgi()

I've opened http://bugs.python.org/issue16945 to rewrite this to use subprocess.

Impacted modules:

* multiprocessing * socketserver * subprocess * tempfile

Hum, I thought temporay file are already created with the close-on-exec flag.

* xmlrpc.server * Maybe: signal, threading

XXX Should subprocess.Popen set the close-on-exec flag on file XXX XXX descriptors of the constructor the passfds argument? XXX

What? Setting them cloexec would prevent them from being inherited in the child process!

Add a new optional cloexec argument to:

* Maybe also: os.open(), os.openpty()

Open can be passed O_CLOEXEC directly.

* Many functions of the Python standard library creating file descriptors are cannot be changed by this proposal, because adding a cloexec optional argument would be surprising and too many functions would need it. For example, os.urandom() uses a temporary file on UNIX, but it calls a function of Windows API on Windows. Adding a cloexec argument to os.urandom() would not make sense. See Always set close-on-exec flag for an incomplete list of functions creating file descriptors. * Checking if a module creates file descriptors is difficult. For example, os.urandom() creates a file descriptor on UNIX to read /dev/urandom (and closes it at exit), whereas it is implemented using a function call on Windows. It is not possible to control close-on-exec flag of the file descriptor used by os.urandom(), because os.urandom() API does not allow it.

I think that the rule of thumb should be simple: if a library opens a file descriptor which is not exposed to the user, either because it's opened and closed before returning (e.g. os.urandom()) or the file descriptor is kept private (e.g. poll(), it should be set close-on-exec. Because the FD is not handed over to the user, there's no risk of breaking applications: that's what the glibc does (e.g. for getpwnam(), so you're sure you don't leak an open FD to /etc/passwd).

Always set close-on-exec flag on new file descriptors created by Python. This alternative just changes the default value of the new cloexec argument.

In a perfect world, all FDS should have been cloexec by default. But it's too late now, I don't think we can change the default, it's going to break some applications, and would be a huge deviation from POSIX (however broken this design decision is).

Add a function to set close-on-exec flag by default ---------------------------------------------------

An alternative is to add also a function to change globally the default behaviour. It would be possible to set close-on-exec flag for the whole application including all modules and the Python standard library. This alternative is based on the Proposal and adds extra changes.

* It is not more possible to know if the close-on-exec flag will be set or not just by reading the source code.

That's really a show stopper:

""" s = socket.socket() if os.fork() == 0: # child os.execve(['myprog', 'arg1]) else: # parent [...] """

It would be impossible to now if the socket is inherited, because the behavior of the all program is affected by an - hidden - global variable. That's just too wrong.

Also, it has the same drawbacks as global variables: not thread-safe, not library-safe (i.e. if two libraries set it to conflicting values, you don't know which one is picked up).

Close file descriptors after fork ---------------------------------

This PEP does not fix issues with applications using fork() without exec(). Python needs a generic process to register callbacks which would be called after a fork, see Add an 'afterfork'_ _module. Such registry could be used to close file descriptors just after a fork().

An atfork() module would indeed be really useful, but I don't think it should be used for closing file descriptors: file descriptors are a scarce resource, so should be closed as soon as possible, explicitly. Also, you could end up actually leaking file descriptors just by keeping a reference to them in the atfork module (that's why it should probably use weakrefs, but that's another debate).

The atfork mecanism is useful for patterns where some resource is acquired/open in a library, and needs reiniatialization in a child process (e.g. locks to avoid deadlocks, or random seed), not to encourage lazy programming.

Applications using inherance of file descriptors ================================================

Most developers don't know that file descriptors are inherited by default. Most programs do not rely on inherance of file descriptors. For example, subprocess.Popen was changed in Python 3.2 to close all file descriptors greater than 2 in the child process by default. No user complained about this behavior change.

"yet" ;-)

Network servers using fork may want to pass the client socket to the child process. For example, on UNIX a CGI server pass the socket client through file descriptors 0 (stdin) and 1 (stdout) using dup2(). This specific case is not impacted by this PEP because the close-on-exec flag is never set on file descriptors smaller than 3.

CGI servers, inetd servers, etc.

Example of programs taking file descriptors from the parent process using a command line option:

* gpg: --status-fd <fd>, --logger-fd <fd>, etc. * openssl: -pass fd:<fd> * qemu: -add-fd <fd> * valgrind: --log-fd=<fd>, --input-fd=<fd>, etc. * xterm: -S <fd>

Hum, yes, but since the official way to start new process is through the subprocess module, I expect that all python code wrapping those programs is broken since subprocess "close_fds" parameter has been changed to "True" in 3.2.



More information about the Python-Dev mailing list