[Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?

Nathaniel Smith njs at pobox.com
Thu Jun 16 03:36:13 EDT 2016


On Wed, Jun 15, 2016 at 10:25 PM, Theodore Ts'o <tytso at mit.edu> wrote:

On Wed, Jun 15, 2016 at 04:12:57PM -0700, Nathaniel Smith wrote:

- It's not exactly true that the Python interpreter doesn't need cryptographic randomness to initialize SipHash -- it's more that some Python invocations need unguessable randomness (to first approximation: all those which are exposed to hostile input), and some don't. And since the Python interpreter has no idea which case it's in, and since it's unacceptable for it to break invocations that don't need unguessable hashes, then it has to err on the side of continuing without randomness.

All that's fine. In practice, those Python invocations which are exposed to hostile input are those that are started while the network is up. The vast majority of the time, they are launched by the web browser --- and if this happens after a second or so of the system getting networking interrupts, (a) getrandom won't block, and (b) /dev/urandom and getrandom will be initialized.

Not sure what you mean about the vast majority of Python invocations being launched by the web browser? But anyway, sure, usually this isn't an issue. This is just a discussion of what to do in the unlikely case where it actually has become an issue, and it's hard to be certain that this will never happen. E.g. it's entirely plausible that someone will write some cloud-init plugin that exposes an HTTP server or something. People do all kinds of weird things in VMs these days... Basically this is a question of whether we should make an (unlikely) error totally invisible to the user, and "errors should never pass silently" is right there in the Zen of Python :-).

Also, I wish people would stop saying that this is only an issue on Linux. Again, FreeBSD's /dev/urandom will block as well if it is uninitialized. It's just that in practice, for both Linux and FreeBSD, we try very hard to make sure /dev/urandom is fully initialized by the time it matters. So far, it's only on Linux that there was an attempt to use Python in the early init scripts, in a VM, and in a system where everything is modularized, such that the deadlock became visible.

(I guess the way to implement this would be for the SipHash initialization code -- which runs very early -- to set some flag, and then we expose that flag in sys.something, and later in the startup sequence check for it after the warnings module is functional. Exposing the flag at the Python level would also make it possible for code like cloud-init to do its own explicit check and respond appropriately.)

I really don't think it's that big of a deal in practice, but if you really are concerned about the very remote possibility that a Python invocation could start in early boot, and then also stick around for the long term, and then be exposed to hostile input --- what if you set the flag, and then later on, after N minutes, either automatically or via some trigger such as cloud-init, try and see if /dev/urandom is initialized (even a few seconds later, so long as the init scripts are hanging, it should be initialized), and then have Python rehash all of its dicts, or maybe just the non-system dicts (since those are presumably the ones most likely to be exposed to hostile input)?
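To make the two ideas above concrete, here is a rough sketch, not a proposal for the actual implementation. It assumes a Python that provides os.getrandom() and os.GRND_NONBLOCK (Linux only), and the names kernel_rng_ready and warn_if_weak_seed are made up for illustration. The weak_seed argument stands in for the flag that the SipHash initialization code would set; no such attribute is assumed to exist in sys.

```python
import os
import warnings


def kernel_rng_ready():
    """Report whether the kernel RNG is initialized, without blocking.

    Returns True or False when the answer is known, and None when it
    cannot be determined (no os.getrandom on this platform, or the
    system call itself is unavailable).
    """
    getrandom = getattr(os, "getrandom", None)
    nonblock = getattr(os, "GRND_NONBLOCK", None)
    if getrandom is None or nonblock is None:
        return None
    try:
        getrandom(1, nonblock)
    except BlockingIOError:
        # The pool is not initialized yet; a blocking call would hang here.
        return False
    except OSError:
        # For example, a kernel that predates the getrandom() system call.
        return None
    return True


def warn_if_weak_seed(weak_seed):
    """Emit a warning once the warnings machinery is usable.

    weak_seed is a hypothetical stand-in for the flag set during hash
    initialization. Returns True if a warning was issued.
    """
    if weak_seed:
        warnings.warn(
            "hash randomization was seeded without kernel entropy",
            RuntimeWarning,
        )
        return True
    return False
```

Code like cloud-init could call something like kernel_rng_ready() on its own before exposing a network service, which is the "explicit check" mentioned above.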

I don't think this is technically doable. There's no global list of hash tables, and Python exposes the actual hash values to user code with some guarantee that they won't change.
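A small illustration of the second point, using only the builtin hash() and a made-up Key class: user code can observe a hash value and store it, so the interpreter has no way to find and update every copy if the seed were changed mid-process.

```python
class Key:
    """A user-defined key that records its hash once, at construction."""

    def __init__(self, name):
        self.name = name
        self._hash = hash(name)  # hash value observed and stored by user code

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        return isinstance(other, Key) and self.name == other.name


# The stored value is baked into this dict's layout.
table = {Key("spam"): 1}
observed = hash("spam")

# Within one process the value does not change, so a fresh Key finds the
# old entry. If hash("spam") changed later, the new Key would carry a
# different _hash than the one already in the table and the lookup would miss.
assert hash("spam") == observed
assert table[Key("spam")] == 1
```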

-n

-- Nathaniel J. Smith -- https://vorpus.org


