msg150561 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2012-01-04 00:40 |
Current 3.2.2 docs:

id(object) Return the “identity” of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime. [model]

hash(object) Return the hash value of the object (if it has one). Hash values are integers. They are used to quickly compare dictionary keys

Suggestion: change "Hash values are integers. They ..." to "This should be an integer which is constant for this object during its lifetime. Hash values ..."

Rationale: For builtin class instances, hash values are guaranteed to be constant that long, and only that long, as the default hash(ob) for object() instances is currently, for my win7, 64 bit, 3.2.2 CPython, id(ob) // 16 (the minimum object size). User class instance hashes (with custom __hash__) *should* have the same lifetime. But since Python cannot enforce that, I did not say 'guaranteed'. User code should *not* depend on a longer lifetime, just as for id() output. It seems worth implying that, as for id(), because (based on recent pydev discussion) people seem to be prone to over-generalize the current longer-term stability of number and string hashes, which itself may disappear in future releases. (see #13703)
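
A minimal sketch of the behaviour described above, assuming a CPython interpreter. It only checks that id() and the default hash() stay constant while the object is alive. The id(ob) // 16 relation is the implementation detail reported for one particular build, so it is printed rather than relied on.

```python
ob = object()
first_id, first_hash = id(ob), hash(ob)

# Allocate unrelated objects while ob stays alive.
junk = [object() for _ in range(1000)]

# Both values are stable for the lifetime of ob.
same_id = id(ob) == first_id
same_hash = hash(ob) == first_hash
print("id constant:", same_id, "hash constant:", same_hash)

# Implementation detail (assumption: a 64-bit CPython build);
# other builds or interpreters may print False here.
print("hash(ob) == id(ob) // 16:", hash(ob) == id(ob) // 16)
```

Neither value says anything about what a later object, or a later run, will report.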
|
|
msg150564 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2012-01-04 01:17 |
-1. The hash has nothing to do with the lifetime, but with the value of an object. |
|
|
msg150572 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2012-01-04 02:38 |
Martin, I do not understand. The default hash is based on id (as is default equality comparison), not value. Are you OK with hash values changing if the 'value' changes? My understanding is that changing hash values for objects in sets and dicts is bad, which is why mutable builtins with value-based equality do not have hash values. |
|
|
msg150573 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-01-04 02:40 |
You can define a __hash__ that changes if the object changes. It is not recommended, but it's possible. So I agree with Martin that your proposed clarification is wrong. (I also think that it wouldn't bring anything, either.) Suggest closing as invalid/rejected.
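
A small sketch of why a __hash__ that follows a mutable value is possible but not recommended. The class `Box` and its attribute `value` are hypothetical names used only for illustration.

```python
class Box:
    """Hypothetical value class whose hash follows a mutable attribute."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Box) and self.value == other.value

    def __hash__(self):
        # Legal, but the hash changes whenever self.value changes.
        return hash(self.value)


b = Box(1)
seen = {b}           # stored under hash(1)
b.value = 2          # hash(b) is now hash(2)

found_by_lookup = b in seen                    # looks in the wrong bucket
still_stored = any(item is b for item in seen)  # plain iteration still finds it
print(found_by_lookup, still_stored)
```

The object is still in the set, but hash-based lookup can no longer reach it.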
|
|
msg150585 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2012-01-04 04:07 |
Given that the doc says that use of hash() is to compare dict keys, it does not seem wrong to me to suggest that hash() should be usable to do so. I believe id() and consequently hash() are unique among builtins in being run-dependent. That is currently documented for id() but not for hash(). Given that people seriously asked whether we can randomize hash() with each run, because 'people' 'expect' it to remain rather constant, it does not seem useless to clarify that it can change with each run. I am sure my wording could be improved. An alternative would be "Hash values for built-in objects are constant for each run but not necessarily thereafter." If you take into account what people can do with special methods, some of the other entries seem more wrong than my suggestion. For instance: "len(s) Return the length (the number of items) of an object." and "str(obj ... When only object is given, this returns its nicely printable representation." These are true only for built-in objects, but the policy is to leave out the qualification.
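
A hedged sketch of run-dependence, assuming CPython 3.3 or later, where str hashes are salted per process and the PYTHONHASHSEED environment variable pins the salt. The helper name `str_hash_in_child` is hypothetical. It starts a fresh interpreter so that each call is a separate "run".

```python
import os
import subprocess
import sys


def str_hash_in_child(seed):
    """Return hash('abc') as computed by a fresh interpreter with the given seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "print(hash('abc'))"], env=env
    )
    return int(out)


# Same seed: the hash is reproducible across runs.
same_seed_equal = str_hash_in_child(1) == str_hash_in_child(1)
# Different seeds: the same string hashes differently.
different_seed_differs = str_hash_in_child(1) != str_hash_in_child(2)
print(same_seed_equal, different_seed_differs)
```

Within any single run the value is constant, which is all that dict and set lookups need.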
|
|
msg150586 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2012-01-04 04:38 |
-1 I concur with Martin. |
|
|
msg150595 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2012-01-04 08:17 |
> Martin, I do not understand. The default hash is based on id (as is
> default equality comparison), not value.

In the default implementation, the id *is* the object's value (i.e. objects, by default, only compare equal if they are identical). So the default implementation is just a special case of the more general rule that hashes need to be consistent with equality.

> Are you OK with hash values changing if the 'value' changes?

An object that can change its value (i.e. a mutable object) should fail to hash.
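
The two cases above can be sketched as follows, assuming Python 3 semantics. `Plain`, `ValueOnly`, and `is_hashable` are hypothetical names for illustration.

```python
class Plain:
    """Default behaviour: equality is identity, and the hash follows it."""


class ValueOnly:
    """Defines value-based __eq__ but no __hash__."""

    def __init__(self, v):
        self.v = v

    def __eq__(self, other):
        return isinstance(other, ValueOnly) and self.v == other.v


def is_hashable(obj):
    try:
        hash(obj)
    except TypeError:
        return False
    return True


a = Plain()
print(a == a, a == Plain())        # equal only to itself
print(is_hashable(a))              # identity-based hash works
print(is_hashable([]))             # mutable built-in refuses to hash
print(is_hashable(ValueOnly(1)))   # Python 3 sets __hash__ to None here
```

In Python 3, defining __eq__ without __hash__ makes instances unhashable, so a value-based class has to opt in to hashing explicitly.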
|
|
msg150596 - (view) |
Author: Marc-Andre Lemburg (lemburg) *  |
Date: 2012-01-04 09:13 |
Terry J. Reedy wrote:
> Terry J. Reedy <tjreedy@udel.edu> added the comment:
>
> Martin, I do not understand. The default hash is based on id (as is default equality comparison), not value. Are you OK with hash values changing if the 'value' changes? My understanding is that changing hash values for objects in sets and dicts is bad, which is why mutable builtins with value-based equality do not have hash values.

Hash values are based on the object values, not their id(). See the various type implementations as reference. The id() is only used as hash for objects which don't have a "value" (and thus cannot be compared).

Given that we have the invariant "a==b => hash(a)==hash(b)" in Python, it immediately follows that hash values for objects with comparison method cannot have a lifetime - at least not within the same process and, depending how you look at it, also not in multi-process applications.
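
A short illustration of the invariant "a==b => hash(a)==hash(b)" using only built-in and standard-library types. Equal values share a hash even though they are distinct objects of different types, so the hash cannot be tied to any one object's lifetime.

```python
from decimal import Decimal
from fractions import Fraction

# Distinct objects of different types that all compare equal to 1.
ones = [1, 1.0, Fraction(1, 1), Decimal(1), True]
all_equal = all(x == 1 for x in ones)
distinct_hashes = {hash(x) for x in ones}
print(all_equal, len(distinct_hashes))

# Two separately built tuples: different objects, same value, same hash.
t1 = tuple([1, 2])
t2 = tuple([1, 2])
print(t1 is t2, t1 == t2, hash(t1) == hash(t2))
```

Because of this, all five numbers collapse to a single key when used in a dict or set.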
|
|
msg150599 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2012-01-04 09:46 |
[Antoine]
> Suggest closing as invalid/rejected.

[Martin]
> -1. The hash has nothing to do with the lifetime,
> but with the value of an object.
|
|