Message 267893 - Python tracker

I am increasingly convinced that I'm right.

--

First, consider that the functions in the os module, as a rule, are a thin shell over the equivalent function provided by the operating system. If Python exposes a function called os.XYZ(), and it calls the OS, then with few exceptions it does so by calling a function called XYZ().**

This has several ramifications, and these are effectively guarantees for the Python programmer:

Now read this snippet of the documentation for os.urandom():

"The returned data should be unpredictable enough for cryptographic applications, though its exact quality depends on the OS implementation. On a Unix-like system this will query /dev/urandom, and on Windows it will use CryptGenRandom()."

That text has been in the documentation for os.urandom() since at least Python 2.6. (That's as old as we have on the web site; I didn't go hunting for older documentation.)

Thus the documentation for os.urandom():

Thus, while it's laudable to try to give the user higher-quality random bits when they call os.urandom(), you cannot degrade the behavior of the system's /dev/urandom when doing so. On Linux, /dev/urandom is guaranteed to never block. This guarantee is so strong, Mr. Ts'o had to add a separate facility to Linux (/dev/random) to permit blocking. os.urandom() must replicate this behavior.

What I'm proposing is that os.urandom() may use getrandom(GRND_NONBLOCK) to attempt to get higher-quality random bits, but it must not block. If that call fails, it will use /dev/urandom, exactly as it is documented to do.

(Naturally this flunks the "atomic operation" test. But in the case of procuring random bits, the atomicity of its operation is obviously irrelevant.)
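A minimal sketch of the fallback proposed above, written in Python rather than in the C where the real change would live. It assumes a Unix-like system with /dev/urandom. The name urandom_never_block is hypothetical, and os.getrandom() and os.GRND_NONBLOCK exist only on Linux builds of Python 3.6 and later, so the sketch checks for them first.

```python
import os


def urandom_never_block(n):
    """Return n random bytes, preferring getrandom() but never blocking.

    Hypothetical helper for illustration; not the CPython implementation.
    """
    data = b""
    # os.getrandom is only present on Linux builds of Python 3.6+.
    if hasattr(os, "getrandom"):
        try:
            # GRND_NONBLOCK makes the call fail instead of waiting for
            # the kernel's entropy pool to be initialized.
            data = os.getrandom(n, os.GRND_NONBLOCK)
        except OSError:
            # Covers BlockingIOError (pool not ready) and ENOSYS (the
            # running kernel has no getrandom syscall).
            data = b""
    if len(data) < n:
        # Documented behavior: read from /dev/urandom. A single read may
        # return fewer bytes than requested, so loop until we have n.
        with open("/dev/urandom", "rb", buffering=0) as dev:
            while len(data) < n:
                data += dev.read(n - len(data))
    return data
```

Calling urandom_never_block(16) returns 16 bytes whether or not getrandom() was able to supply them. The caller cannot tell which source was used, which is the point: the function behaves like /dev/urandom in the worst case and better in the common case.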

** The exception to this, naturally, is Windows. Internally the os module is called "posixmodule"--and this is no coincidence. AFAIK every platform supported by CPython is POSIX-based except Windows. The choice was made long ago to simulate POSIX behavior on Windows so as to present a consistent API to the programmer. If you're curious about this, and have the time, read the implementation of os.stat for Windows. What a rush!

--

Second, I invoke the "consenting adults" rule. Python provides well-documented behavior for os.urandom(). You cannot make assumptions about the use case of the caller and decide for them that they would prefer the function block in an unbounded fashion rather than provide low-quality random bits.

And yes, unbounded. As covered earlier in the thread, it blocked for only 90 seconds because systemd killed it at that point. We don't know how long it would actually have blocked. This is completely unacceptable--for startup, for "import random", and for "os.urandom()" on Linux.

--

Third, because the os module is in general a thin wrapper over what the OS provides, I disapprove of "cryptorandom()" and "pseudorandom()" going into the os module. There are no functions with these names on any OS of which I'm aware. This is why I proposed "os.getrandom(n, block=True)". From its signature, the function it calls on your OS will be obvious, and its semantics on your OS will be documented by your OS.

Thus I am completely unwilling to add os.cryptorandom() and os.pseudorandom() in 3.5.2.
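For illustration, here is roughly how the os.getrandom(n, block=True) signature proposed above could map onto the system call. This is a sketch under assumptions: getrandom_sketch is a hypothetical name, and it is layered on the os.getrandom(size, flags=0) function that exists on Linux in Python 3.6 and later, which takes flags rather than a block argument.

```python
import os


def getrandom_sketch(n, block=True):
    """Hypothetical wrapper matching the proposed (n, block=True) signature."""
    if not hasattr(os, "getrandom"):
        # No getrandom() on this platform or Python version, so there is
        # nothing thin to wrap.
        raise NotImplementedError("getrandom() is not available here")
    # block=True  -> flags 0: wait until the entropy pool is initialized.
    # block=False -> GRND_NONBLOCK: raise BlockingIOError instead of waiting.
    flags = 0 if block else os.GRND_NONBLOCK
    return os.getrandom(n, flags)
```

With block=False the caller decides what to do when the pool is not ready, which keeps the policy choice with the consenting adult making the call.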