Message 152149 - Python tracker
On Sat, 2012-01-28 at 03:03 +0000, Benjamin Peterson wrote:
Benjamin Peterson <benjamin@python.org> added the comment:
For the record, Barry and I agreed on what we'll be doing for stable releases [1]. David says he should have a patch soon.
[1] http://mail.python.org/pipermail/python-dev/2012-January/115892.html
I'm attaching what I've got so far (need sleep).
Attached patch is for 3.1 and adds opt-in hash randomization.
It's based on haypo's work: random-8.patch (thanks haypo!), with additional changes as seen in my backport of that to 2.7: http://bugs.python.org/issue13703#msg151847
The randomization is off by default and must be enabled by setting a new environment variable, PYTHONHASHRANDOMIZATION, to a non-empty string. (When it is enabled, PYTHONHASHSEED also still works, if provided, in the same way as in haypo's patch.)
All of the various "Py_hash_t" become "long" again (Py_hash_t was added in 3.2: )
I expanded the randomization from just PyUnicodeObject to also cover PyBytesObject, and the types within datetime.
It doesn't cover numeric types; see my explanation in ; also see http://bugs.python.org/issue13703#msg151870
It doesn't yet cover the embedded copy of expat.
I moved the hash tests from test_unicode.py to test_hash.py
I tweaked the wording of the descriptions of the envvars in cmdline.rst and the manpage
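For anyone who wants to poke at the opt-in behaviour described above, here is a rough sketch of how I'd check it from the outside. It spawns fresh child interpreters with a given environment and compares the hash of a fixed string. The helper name child_hash is made up for this example. PYTHONHASHRANDOMIZATION only means anything on an interpreter built with this patch; an interpreter without it will simply ignore the variable, so treat the first comparison as illustrative only.

```python
import os
import subprocess
import sys


def child_hash(text, extra_env):
    """Return hash(text) as computed by a fresh child interpreter.

    child_hash is a hypothetical helper for this sketch, not part of the patch.
    """
    env = dict(os.environ)
    # Drop any inherited settings so only the caller's variables are in play.
    env.pop("PYTHONHASHSEED", None)
    env.pop("PYTHONHASHRANDOMIZATION", None)
    env.update(extra_env)
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
    )
    return int(out.decode().strip())


if __name__ == "__main__":
    # Assumption: the interpreter honours PYTHONHASHRANDOMIZATION (patched build).
    opt_in = {"PYTHONHASHRANDOMIZATION": "1"}
    print("two opt-in runs:", child_hash("abc", opt_in), child_hash("abc", opt_in))

    # With a fixed seed as well, the two runs should agree with each other.
    seeded = {"PYTHONHASHRANDOMIZATION": "1", "PYTHONHASHSEED": "42"}
    print("two seeded runs:", child_hash("abc", seeded), child_hash("abc", seeded))
```

On a patched build I'd expect the two opt-in runs to differ from each other and the two seeded runs to match; I have not included sample output because it depends entirely on the build being tested.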
I've tested it on a 32-bit box, and it successfully protects against one set of test data. There are four cases: assembling and then reading back items by key, for a dict vs a set and for bytes vs str, using 200000 distinct items of data which all have hash() == 0 in an unmodified build. Each case takes about 1.5 seconds on this --with-pydebug build, versus on the order of hours otherwise.
I haven't benchmarked it yet.
Only tested on Linux (Fedora x86_64 and i686). I don't know the impact on Windows (e.g. startup time without the env vars vs with them).
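To show why the colliding test data is so slow on an unmodified build, here is a small sketch that counts key comparisons rather than measuring wall-clock time. It is not my actual test data (that uses str and bytes values whose hash() is 0). The Colliding class and count_comparisons function are stand-ins invented for this example, and a class with a hard-coded __hash__ like this would of course not be helped by randomization. It only illustrates the cost of every key landing in the same probe chain.

```python
class Colliding:
    """Key whose hash is always 0, so all instances collide in a dict.

    Hypothetical stand-in for the str/bytes test data with hash() == 0.
    """

    eq_calls = 0  # class-wide counter of __eq__ invocations

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 0

    def __eq__(self, other):
        Colliding.eq_calls += 1
        return isinstance(other, Colliding) and self.value == other.value


def count_comparisons(n):
    """Insert n distinct colliding keys into a dict; return the number of __eq__ calls."""
    Colliding.eq_calls = 0
    table = {}
    for i in range(n):
        table[Colliding(i)] = i
    return Colliding.eq_calls


if __name__ == "__main__":
    for n in (100, 200, 400):
        print(n, "keys ->", count_comparisons(n), "comparisons")
```

Each new key has to be compared against the keys already in the chain, so doubling the number of keys roughly quadruples the work, which is how 200000 items gets from seconds to hours.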
I'm seeing one failing test:
FAIL: test_clear_dict_in_ref_cycle (__main__.ModuleTests)
Traceback (most recent call last):
  File "/home/david/coding/python-hg/cpython-3.1-hash-randomization/Lib/test/test_module.py", line 79, in test_clear_dict_in_ref_cycle
    self.assertEqual(destroyed, [1])
AssertionError: Lists differ: [] != [1]