[Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?
Nathaniel Smith njs at pobox.com
Thu Jun 16 03:53:38 EDT 2016
On Wed, Jun 15, 2016 at 11:45 PM, Barry Warsaw <barry at python.org> wrote:
> On Jun 15, 2016, at 01:01 PM, Nick Coghlan wrote:
>
>> No, this is a bad idea. Asking novice developers to make security
>> decisions they're not yet qualified to make when it's genuinely
>> possible for us to do the right thing by default is the antithesis of
>> good security API design, and os.urandom() is a security API (whether
>> we like it or not - third party documentation written by the
>> cryptographic software development community has made it so, since
>> it's part of their guidelines for writing security sensitive code in
>> pure Python).
>
> Regardless of what third parties have said about os.urandom(), let's look at
> what we have said about it. Going back to pre-churn 3.4 documentation:
>
>     os.urandom(n)
>         Return a string of n random bytes suitable for cryptographic use.
>
>         This function returns random bytes from an OS-specific randomness
>         source. The returned data should be unpredictable enough for
>         cryptographic applications, though its exact quality depends on the
>         OS implementation. On a Unix-like system this will query
>         /dev/urandom, and on Windows it will use CryptGenRandom(). If a
>         randomness source is not found, NotImplementedError will be raised.
>
>         For an easy-to-use interface to the random number generator provided
>         by your platform, please see random.SystemRandom.
>
> So we very clearly provided platform-dependent caveats on the cryptographic
> quality of os.urandom(). We also made a strong claim that there's a direct
> connection between os.urandom() and /dev/urandom on "Unix-like system(s)".
>
> We broke that particular promise in 3.5. and semi-fixed it 3.5.2.
>
>> Adding new APIs is also a bad idea, since "os.urandom() is the right
>> answer on every OS except Linux, and also the best currently available
>> answer on Linux" has been the standard security advice for generating
>> cryptographic secrets in pure Python code for years now, so we should
>> only change that guidance if we have extraordinarily compelling
>> reasons to do so, and we don't.
>
> Disagree.
>
> We have broken one long-term promise on os.urandom() ("On a Unix-like system
> this will query /dev/urandom") and changed another ("should be unpredictable
> enough for cryptographic applications, though its exact quality depends on OS
> implementations").
>
> We broke the experienced Linux developer's natural and long-standing link
> between the API called os.urandom() and /dev/urandom. This breaks pre-3.5
> code that assumes read-from-/dev/urandom semantics for os.urandom().
>
> We have introduced churn. Predicting a future SO question such as "Can
> os.urandom() block on Linux?" the answer is "No in Python 3.4 and earlier, yes
> possibly in Python 3.5.0 and 3.5.1, no in Python 3.5.2 and the rest of the
> 3.5.x series, and yes possibly in Python 3.6 and beyond".
It also depends on the kernel version, since it will never block on old kernels that are missing getrandom(), but it might block on future kernels if Linux's /dev/urandom ever becomes blocking. (Ted's said that this is not going to happen now, but the only reason is that he tried to make the change and it broke some distros that are still in use -- so it seems entirely possible that it will happen a few years from now.)
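To make the "it depends" concrete, here is a rough sketch of how code could probe whether the kernel's entropy pool is initialized yet. It assumes Python 3.6 or later, where os.getrandom() and os.GRND_NONBLOCK are exposed on Linux; neither existed in the os module when this thread was written, and the helper name urandom_would_block is made up for illustration.

```python
import os


def urandom_would_block():
    """Best-effort probe: would a blocking getrandom() call wait right now?

    Returns True or False when the kernel can tell us, and None when it
    cannot (no getrandom() wrapper on this platform, or an old kernel).
    Hypothetical helper, not part of any stdlib API.
    """
    getrandom = getattr(os, "getrandom", None)
    nonblock = getattr(os, "GRND_NONBLOCK", None)
    if getrandom is None or nonblock is None:
        # Non-Linux platform, or a Python without the getrandom() wrapper.
        return None
    try:
        # Ask for one byte without waiting for the pool to be initialized.
        getrandom(1, nonblock)
    except BlockingIOError:
        # The kernel reports that the entropy pool is not initialized yet.
        return True
    except OSError:
        # For example ENOSYS on kernels that predate getrandom().
        return None
    return False
```

On a machine that has been up for more than a few seconds this should report False; the True case only shows up very early in boot, which is exactly the window the thread is arguing about.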
> We have a better answer for "cryptographically appropriate" use cases in
> Python 3.6 - the secrets module. Trying to make os.urandom() "the right
> answer on every OS" weakens the promotion of secrets as the module to use for
> cryptographically appropriate use cases.
>
> IMHO it would be better to leave os.urandom() well enough alone, except for
> the documentation which should effectively say, a la 3.4:
>
>     os.urandom(n)
>         Return a string of n random bytes suitable for cryptographic use.
>
>         This function returns random bytes from an OS-specific randomness
>         source. The returned data should be unpredictable enough for
>         cryptographic applications, though its exact quality depends on the
>         OS implementation. On a Unix-like system this will query
>         /dev/urandom, and on Windows it will use CryptGenRandom(). If a
>         randomness source is not found, NotImplementedError will be raised.
>
>         Cryptographic applications should use the secrets module for stronger
>         guaranteed sources of randomness.
>
>         For an easy-to-use interface to the random number generator provided
>         by your platform, please see random.SystemRandom.
This is not an accurate docstring, though. The more accurate docstring for your proposed behavior would be:
    os.urandom(n)
        Return a string of n bytes that will usually, but not always, be suitable for cryptographic use.

        This function returns random bytes from an OS-specific randomness source. On non-Linux OSes, this uses the best available source of randomness, e.g. CryptGenRandom() on Windows and /dev/urandom on OS X, and thus will be strong enough for cryptographic use. However, on Linux it uses a deprecated API (/dev/urandom) which in rare cases is known to return bytes that look random, but aren't. There is no way to know when this has happened; your code will just silently stop being secure. In some unusual configurations, where Python is not configured with any source of randomness, it will raise NotImplementedError.

        You should never use this function. If you need unguessable random bytes, then the 'secrets' module is always a strictly better choice -- unlike this function, it always uses the best available source of cryptographic randomness, even on Linux. Alternatively, if you need random bytes but it doesn't matter whether other people can guess them, then the 'random' module is always a strictly better choice -- it will be faster, as well as providing useful features like deterministic seeding.
In practice, your proposal means that ~all existing code that uses os.urandom becomes incorrect and should be switched to either secrets or random. This is far more churn for end-users than Nick's proposal.
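For illustration, the split that such a docstring would force on callers might look roughly like the sketch below. It assumes Python 3.6 or later, where the secrets module is available; the two function names are hypothetical, and whether secrets really ends up with stronger guarantees than os.urandom() is precisely what is still under discussion.

```python
import random
import secrets


def unguessable_bytes(n):
    """Bytes for keys, tokens, and other secrets (hypothetical helper)."""
    # secrets.token_bytes() draws from the OS's cryptographic RNG.
    return secrets.token_bytes(n)


def reproducible_bytes(n, seed):
    """Bytes for simulations or tests where guessability is irrelevant."""
    # A seeded Mersenne Twister: fast and deterministic, NOT secure.
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))
```

Every existing os.urandom() call site would have to be audited and moved to one side or the other of this split, which is the churn I'm pointing at.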
...Anyway, since there's clearly going to be at least one PEP about this, maybe we should stop rehashing bits and pieces of the argument in these long threads that most people end up skipping and then rehashing again later?
-n
-- Nathaniel J. Smith -- https://vorpus.org