I think you are right. Here is the source code from Python 2.7.11:

    long
    PyObject_Hash(PyObject *v)
    {
        PyTypeObject *tp = v->ob_type;
        if (tp->tp_hash != NULL)
            return (*tp->tp_hash)(v);
        /* To keep to the general practice that inheriting
         * solely from object in C code should work without
         * an explicit call to PyType_Ready, we implicitly call
         * PyType_Ready here and then check the tp_hash slot again
         */
        if (tp->tp_dict == NULL) {
            if (PyType_Ready(tp) < 0)
                return -1;
            if (tp->tp_hash != NULL)
                return (*tp->tp_hash)(v);
        }
        if (tp->tp_compare == NULL && RICHCOMPARE(tp) == NULL) {
            return _Py_HashPointer(v); /* Use address as hash value */
        }
        /* If there's a cmp but no hash defined, the object can't be hashed */
        return PyObject_HashNotImplemented(v);
    }

If the object's type has a hash function (tp\_hash), it is used. Otherwise, if the type defines no comparison either, \_Py\_HashPointer is used, which does not involve \_Py\_HashSecret.
I also checked the references to \_Py\_HashSecret: only bufferobject, unicodeobject and stringobject use it.
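
The three outcomes of that function can also be seen from the Python side. The following is only a rough Python 3 illustration, not the 2.7 code path itself: the class names and the `describe` helper are made up for this example, and in Python 3 the "comparison but no hash" case shows up as `__hash__` being set to `None`.

```python
class WithHash:
    """The type defines its own hash, so that function is used."""

    def __init__(self, key):
        self.key = key

    def __hash__(self):
        return hash(self.key)

    def __eq__(self, other):
        return isinstance(other, WithHash) and self.key == other.key


class Plain:
    """No hash and no comparison: the default identity-based hash applies."""


class EqOnly:
    """Defines a comparison but no hash: Python 3 marks the type unhashable."""

    def __eq__(self, other):
        return isinstance(other, EqOnly)


def describe(obj):
    """Return hash(obj), or None if the object cannot be hashed."""
    try:
        return hash(obj)
    except TypeError:
        return None


results = {
    # Delegates to hash(42); small ints hash to themselves.
    "with_hash": describe(WithHash(42)),
    # Plain inherits the default hash from object unchanged.
    "plain_uses_default": Plain.__hash__ is object.__hash__,
    # hash() raises TypeError for EqOnly instances.
    "eq_only": describe(EqOnly()),
}
print(results)
```

None of these paths touches the randomization secret unless the custom hash function itself hashes a str or bytes value.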
On Wed, Feb 17, 2016 at 9:54 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Feb 16, 2016 at 11:56:55AM -0800, Glenn Linderman wrote:
\> On 2/16/2016 1:48 AM, Christoph Groth wrote:
\> >Hello,
\> >
\> >Recent Python versions randomize the hashes of str, bytes and datetime
\> >objects. I suppose that the choice of these three types is the result
\> >of a compromise. Has this been discussed somewhere publicly?
\>
\> Search archives of this list... it was discussed at length.
There's a lot of discussion on the mailing list. I think that this is
the very start of it, in Dec 2011:
https://mail.python.org/pipermail/python-dev/2011-December/115116.html
and continuing into 2012, for example:
https://mail.python.org/pipermail/python-dev/2012-January/115577.html
https://mail.python.org/pipermail/python-dev/2012-January/115690.html
and a LOT more, spread over many different threads and subject lines.
You should also read the issue on the bug tracker:
http://bugs.python.org/issue13703
My recollection is that it was decided that only strings and bytes need
to have their hashes randomized, because only strings and bytes can be
used directly from user-input without first having a conversion step
with likely input range validation. In addition, changing the hash for
ints would break too much code for too little benefit: unlike strings,
where hash collision attacks on web apps are proven and easy, hash
collision attacks based on ints are more difficult and rare.
See also the comment here:
http://bugs.python.org/issue13703#msg151847
\> >I'm not a web programmer, but don't web applications also use
\> >dictionaries that are indexed by, say, tuples of integers?
\>
\> Sure, and that is the biggest part of the reason they were randomized.
But they aren't, as far as I can see:
\[steve@ando 3.6\]$ ./python -c "print(hash((23, 42, 99, 100)))"
1071302475
\[steve@ando 3.6\]$ ./python -c "print(hash((23, 42, 99, 100)))"
1071302475
Web apps can use dicts indexed by anything that they like, but unless
there is an actual attack, what does it matter? Guido makes a good point
about security here:
https://mail.python.org/pipermail/python-dev/2013-October/129181.html
\> I think hashes of all types have been randomized, not \_just\_ the list
\> you mentioned.
I'm pretty sure that's not actually the case. Using 3.6 from the repo
(admittedly not fully up to date though), I can see hash randomization
working for strings:
\[steve@ando 3.6\]$ ./python -c "print(hash('abc'))"
11601873
\[steve@ando 3.6\]$ ./python -c "print(hash('abc'))"
\-2009889747
but not for ints:
\[steve@ando 3.6\]$ ./python -c "print(hash(42))"
42
\[steve@ando 3.6\]$ ./python -c "print(hash(42))"
42
which agrees with my recollection that only strings and bytes would be
randomized.
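
The same check can be scripted. This is a minimal sketch, assuming a Python 3.3+ interpreter; `hash_in_fresh_interpreter` is a hypothetical helper name. It pins two different values of the `PYTHONHASHSEED` environment variable instead of relying on the random default, so the comparison does not depend on chance.

```python
import os
import subprocess
import sys


def hash_in_fresh_interpreter(expr, seed):
    """Evaluate hash(<expr>) in a new interpreter started with the given seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "print(hash(%s))" % expr], env=env
    )
    return int(out.decode().strip())


seeds = (1, 2)
str_hashes = [hash_in_fresh_interpreter("'abc'", s) for s in seeds]
int_hashes = [hash_in_fresh_interpreter("42", s) for s in seeds]
tuple_hashes = [hash_in_fresh_interpreter("(23, 42, 99, 100)", s) for s in seeds]

print("str changes with seed:  ", str_hashes[0] != str_hashes[1])
print("int changes with seed:  ", int_hashes[0] != int_hashes[1])
print("tuple changes with seed:", tuple_hashes[0] != tuple_hashes[1])
```

Only the str hash should vary between the two seeds; the int and the tuple of ints should come out identical in both runs.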
\--
Steve
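There are gaps in the joints, and the edge of the blade has no thickness; insert what has no thickness into a gap, and there is room to spare for the blade to move about.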
blog: http://shell909090.org/blog/