[Python-Dev] gc ideas -- sparse memory

Steven D'Aprano steve at pearwood.info
Sat Dec 4 11:54:31 CET 2010


Martin v. Löwis wrote:

>> I'm afraid I don't follow you. Unless you're suggesting some sort of esoteric object system whereby objects don't have identity (e.g. where objects are emergent properties of some sort of distributed, non-localised "information"), any object naturally has an identity -- itself.
>
> Not in Java or C#. In these languages it is possible to determine whether two references refer to the same object. However, objects don't naturally have a distinct identification (be it an integer or something else).

Surely even in Java or C#, objects have an identity even if the language doesn't provide a way to query their distinct identification. An object stored at one memory location is distinct from any other object stored at different memory locations at that same moment in time, regardless of whether or not the language gives you a convenient label for that identity. Even if that memory location can change during the lifetime of the object, at any one moment, object X is a different object from every other object.

The fact that we can even talk about "this object" versus "that object" implies that objects have identity.

To put it in Python terms, if the id() function were removed, it would no longer be possible to get the unique identification associated with an object, but you could still compare the identity of two objects using is.
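A minimal sketch of that distinction (the names a, b and c are purely illustrative): the `is` operator compares identity directly, without ever needing the number that id() would return.

```python
a = [1, 2, 3]
b = a            # a second reference to the same object
c = [1, 2, 3]    # an equal value, but a distinct object

same_ab = a is b     # identity comparison: same object
same_ac = a is c     # identity comparison: different objects
equal_ac = a == c    # equality comparison: equal contents
```

Here same_ab is true and same_ac is false, even though a == c holds.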

Of course, I'm only talking about objects. In Java there are values which aren't objects, such as ints and floats. That's irrelevant for our discussion, because Python has no such values.
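To make that last point concrete, a quick sketch (the variable names are only illustrative): ints and floats in Python are ordinary objects, with methods and an identity of their own.

```python
n = 42
f = 3.5

int_is_object = isinstance(n, object)
float_is_object = isinstance(f, object)
bits = n.bit_length()        # ints have methods like any other object

n_alias = n                  # a second reference to the same int object
same_int = n_alias is n      # identity comparison works on ints too
```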

> If you really want to associate unique numbers with objects in these languages, the common approach is to put them into an identity dictionary as keys.
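For illustration only, the identity-dictionary approach might be sketched in Python roughly as follows. The class name IdentityNumbering and its method number_of are hypothetical, and the sketch leans on id() itself as the dictionary key, which a language without id() would have to replace with a built-in identity map (Java has java.util.IdentityHashMap for this purpose).

```python
import itertools


class IdentityNumbering:
    """Hypothetical helper: hand out one unique integer per object,
    keyed on identity rather than on equality."""

    def __init__(self):
        self._counter = itertools.count()
        # Maps id(obj) -> (obj, number). Holding a reference to obj keeps
        # it alive, so its id() cannot be reused by another object.
        self._entries = {}

    def number_of(self, obj):
        key = id(obj)
        if key not in self._entries:
            self._entries[key] = (obj, next(self._counter))
        return self._entries[key][1]


numbering = IdentityNumbering()
x = [1, 2]
y = [1, 2]    # equal to x, but a different object

n_x = numbering.number_of(x)
n_y = numbering.number_of(y)
n_x_again = numbering.number_of(x)   # same object, same number as before
```

Note that x and y are equal lists yet receive different numbers, and that the scheme works even for unhashable objects, since only identity is consulted.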

>> It seems counter-productive to me to bother with an identity function which doesn't meet that constraint. If id(x) == id(y) implies nothing about x and y (they may, or may not, be the same object) then what's the point?
>
> See James' explanation: it would be possible to use this as the foundation of an identity hash table.

I'm afraid James' explanation didn't shed any light on the question for me. It seems to me that Java's IdentityHashValue [sic -- I think the correct function name is actually IdentityHashCode] is equivalent to Python's hash(), not to Python's id(), and claiming it is related to identity is misleading and confusing.

I don't think I'm alone here -- it seems to me that even among Java programmers, the same criticisms have been raised:

http://bugs.sun.com/bugdatabase/view_bug.do?bug%5Fid=6321873
http://deepakjha.wordpress.com/2008/07/31/interesting-fact-about-identityhashcode-method-in-javalangsystem-class/

Like hash(), IdentityHashCode doesn't make any promises about identity at all. Two distinct objects could have the same hash code, and a further test is needed to distinguish them.
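The same situation is easy to reproduce with Python's hash(). In this sketch, Colliding is a hypothetical class whose instances all deliberately share one hash code; a dict still tells them apart, but only because it performs the further comparison after the hashes match.

```python
class Colliding:
    """Hypothetical class: every instance reports the same hash code."""

    def __hash__(self):
        return 42

    # __eq__ is inherited from object, so equality falls back to identity.


p = Colliding()
q = Colliding()

same_hash = hash(p) == hash(q)    # the hash codes collide
same_object = p is q              # yet these are two distinct objects

# The dict uses the hash to pick a slot, then compares keys to resolve
# the collision, so both entries survive.
table = {p: "first", q: "second"}
```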

>> Why would you bother using that function when you could just use x == y instead?
>
> Because in a hash table, you also need a hash value.

Well, sure, in a hash table you need a hash value. But I was talking about an id() function.

So is that it? Is IdentityHashValue (or *Code, as the case may be) just a longer name for hash()?

-- Steven


