[Python-Dev] Documentation Error for __hash__

Matt Giuca matt.giuca at gmail.com
Fri Aug 29 15:07:25 CEST 2008


Note that only instances have the default hash value id(obj). This is not true in general. Most types don't implement the tp_hash slot and thus are not hashable. Indeed, mutable types should not implement that slot unless they know what they're doing :-)

By "instances" you mean "instances of user-defined classes"? (I carefully avoid the term "instance" on its own, since I believe that was phased out when they merged types and classes; it probably still exists in the C API but shouldn't be mentioned in Python-facing documentation).

But anyway, yes, we should make that distinction.
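To make that distinction concrete, here is a minimal sketch, assuming Python 3. The class name Plain is made up for illustration. It only shows that instances of a user-defined class are hashable by default while a built-in mutable type such as list is not; it makes no claim about the exact value the default hash returns.

```python
class Plain:
    """Hypothetical user-defined class with neither __eq__ nor __hash__."""


a = Plain()
b = Plain()

# Instances inherit object.__hash__, so they can be hashed and used as keys.
hash_a = hash(a)
lookup = {a: "first", b: "second"}

# Built-in mutable containers such as list opt out of hashing.
try:
    hash([1, 2, 3])
    list_hashable = True
except TypeError:
    list_hashable = False

print(lookup[a], lookup[b], list_hashable)
```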

Sorry, I wasn't clear enough: with "not defining an equal comparison" I meant that an equal comparison does not succeed, i.e. raises an exception or returns Py_NotImplemented (at the C level).

Oh OK. I didn't even realise it was "valid" or "usual" to explicitly block __eq__ like that.
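For what it's worth, the Python-level counterpart of those two ways of blocking a comparison might look like the sketch below. It assumes Python 3, and the class names Declines and Raises are hypothetical. As far as I understand it, when __eq__ returns NotImplemented for both operands, == falls back to comparing identity, whereas an exception raised inside __eq__ simply propagates.

```python
class Declines:
    """Hypothetical class whose __eq__ always declines to compare."""

    def __eq__(self, other):
        # NotImplemented signals that this method cannot decide the result.
        return NotImplemented


class Raises:
    """Hypothetical class whose __eq__ refuses comparison outright."""

    def __eq__(self, other):
        raise TypeError("comparison not supported")


x = Declines()
y = Declines()
same_object = (x == x)   # both sides decline, so identity decides
different = (x == y)     # two distinct objects

try:
    Raises() == Raises()
    raised = False
except TypeError:
    raised = True

print(same_object, different, raised)
```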

Again, the situation is better at the C level, since types don't have a default tp_hash implementation, so they have to explicitly code such a slot in order for hash(obj) to work.

Yes, but I gather that this "data model" document we are talking about is not designed for C authors, but Python authors, so it should be written from the point of view of people coding only in Python. (Only the "Extending and Embedding" and the "C API" documents are for C authors).

The documentation should probably say:

"If you implement __cmp__ or __eq__ on a class, also implement a __hash__ method (and either have it raise an exception or return a valid non-changing hash value for the object)."

I agree, except maybe not for the Python 3 docs. As long as the behaviour I am observing is well-defined and not just a side-effect which could go away -- that is, if you define __eq__/__cmp__ but not __hash__ in a user-defined class, hashing an instance raises a TypeError -- then I think it isn't necessary to recommend implementing a __hash__ method that raises a TypeError. Better just to leave it as-is ("if it defines __cmp__() or __eq__() but not __hash__(), its instances will not be usable as dictionary keys") and clarify the later statement.
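The behaviour I am describing can be reproduced with a sketch like the following, assuming Python 3. The class names EqOnly and EqAndHash are made up for illustration. The second class shows one way (not the only one) to supply a non-changing hash that is consistent with __eq__.

```python
class EqOnly:
    """Hypothetical class: defines __eq__ but not __hash__."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, EqOnly):
            return NotImplemented
        return self.value == other.value


class EqAndHash:
    """Hypothetical class: defines __eq__ and a matching __hash__."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, EqAndHash):
            return NotImplemented
        return self.value == other.value

    def __hash__(self):
        # Equal objects must hash equal, so hash the compared field.
        return hash(self.value)


try:
    {EqOnly(1): "one"}
    eq_only_unhashable = False
except TypeError:
    eq_only_unhashable = True

table = {EqAndHash(1): "one"}
found = table[EqAndHash(1)]  # a distinct but equal object finds the entry

print(eq_only_unhashable, found)
```

The sketch assumes value is never reassigned after the object has been used as a key; if it were, the hash would change and the entry could no longer be found.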

"If you implement __hash__ on classes, you should consider implementing __eq__ and/or __cmp__ as well, in order to control how dictionaries use your objects."

I don't think I agree with that. I'm not sure why you'd implement __hash__ without __eq__ and/or __cmp__, but it doesn't cause issues so we may as well not address it.
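As a rough illustration of why it doesn't cause issues, here is a sketch assuming Python 3, with a hypothetical class HashOnly. Two instances share a hash, but because the default equality is identity, a dictionary still treats them as different keys.

```python
class HashOnly:
    """Hypothetical class: defines __hash__ but keeps the default __eq__."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)


p = HashOnly(1)
q = HashOnly(1)

table = {p: "p"}
same_hash = (hash(p) == hash(q))
found_p = p in table   # same object, so the lookup succeeds
found_q = q in table   # equal hash, but p == q is False by identity

print(same_hash, found_p, found_q)
```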

In general, it's probably best to always implement both methods on classes, even if the application will just use one of them.

Well it certainly is for new-style classes in the 2.x branch. I don't think you should implement __hash__ in Python 3 if you just want a non-hashable object (since this is the default behaviour anyway).

A lot of this is my opinion, though, which doesn't count for very much -- so I'm just making suggestions.

Matt


