Message 267711 - Python tracker
On 2016-06-07 19:46, Larry Hastings wrote:
Larry Hastings added the comment:
Thank you for summarizing the debate. It made it a lot easier to
- blocking initialization of the hash secret. This occurs regardless of script contents; at present Python simply can't be used at all in low-entropy situations. I feel that this issue is a release blocker.
Possible resolutions:
- accept possible low-entropy initialization of the hash secret, using the patches supplied here by Victor and me.
- add a command-line flag to disable "strong" initialization of the hash secret (or revive the old -R flag).
- simply require user-space workarounds like setting PYTHONHASHSEED.
The latter two approaches are unacceptable IMO. They result in a poor user experience. Python should do the "right" thing by default; the "right" thing includes not taking 90 seconds to start up.
By process of elimination, this leaves only the first approach as viable. Ergo, let's do that.
The hash secret is a 32-bit integer, even on 64-bit builds of Python. It is not and cannot be cryptographically secure. It's frankly ridiculous to fret about "strong" initialization of it at the cost of a 90 second startup time.
(For posterity: when people mention "SipHash", they're talking about the hashing algorithm used for str/dict/etc. The seed for SipHash is the "hash secret" we're talking about here.)
The secret for SipHash is composed of two 64-bit integers (16 bytes). The entire _Py_HashSecret_t struct is 24 bytes. The remaining 8 bytes are used for the XML hash randomization of libexpat. Only a manual seed set with PYTHONHASHSEED is a 32-bit integer, which is stretched to 24 bytes with an LCG (linear congruential generator).
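To make the layout concrete, here is a rough Python sketch of how a 32-bit seed could be stretched to 24 bytes with an LCG and then split into the two SipHash keys plus the 8 bytes for expat. The names lcg_stretch and split_secret are made up for illustration, and the LCG constants and the little-endian byte order are assumptions, so this should not be read as CPython's exact code. Note that the stretched output still carries at most 32 bits of entropy.

```python
import struct


def lcg_stretch(seed: int, size: int = 24) -> bytes:
    """Stretch a 32-bit seed into `size` bytes with a simple LCG (illustrative)."""
    x = seed & 0xFFFFFFFF  # treat the seed as an unsigned 32-bit value
    out = bytearray()
    for _ in range(size):
        # Assumed constants (the common 214013 / 2531011 pair); arithmetic mod 2**32.
        x = (x * 214013 + 2531011) & 0xFFFFFFFF
        out.append((x >> 16) & 0xFF)  # emit one byte per LCG step
    return bytes(out)


def split_secret(secret: bytes):
    """Split 24 bytes into two 64-bit SipHash keys plus 8 bytes for expat."""
    if len(secret) != 24:
        raise ValueError("expected a 24-byte secret")
    k0, k1 = struct.unpack("<QQ", secret[:16])  # byte order is an assumption
    expat_salt = secret[16:24]
    return k0, k1, expat_salt


if __name__ == "__main__":
    secret = lcg_stretch(42)
    k0, k1, expat_salt = split_secret(secret)
    print(len(secret), hex(k0), hex(k1), expat_salt.hex())
```

The same seed always yields the same 24 bytes, which is the point of PYTHONHASHSEED: reproducible hashing, not secrecy.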
Christian