[Python-ideas] Identity dicts and sets

Serhiy Storchaka storchaka at gmail.com
Thu Jan 3 21:43:29 CET 2013


On 03.01.13 20:32, Terry Reedy wrote:

> Yes, and my point was that we effectively already have such things.

> class Node(): pass
>
> Instances of Node wrap a dict as .__dict__, but are compared by wrapper identity rather than dict value. A set of such things is effectively an identity set. In 3.3+, if the instances all have the same attributes (if the .__dict__s all have the same keys), there is only one (not too sparse) hashed list of keys for all instances and one corresponding (not too sparse) list of values for each instance.
>
> Also, which I did not say before: if one instead represents nodes by a unique integer or string or by a list that starts with such a unique identifier, then equality is again effectively identity and a set (or sequence) of such things is effectively an identity set. This corresponds to a standard database table where records have keys, so that the identity of records is not lost when reordered or removed from the table.
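To illustrate the quoted point, here is a minimal sketch. It relies only on the default behaviour of user-defined classes, which hash and compare by identity unless __eq__ or __hash__ is overridden. The attribute name "label" is made up for the example.

```python
class Node:
    pass

a = Node()
b = Node()
a.label = b.label = "same"          # both instances carry equal attribute dicts

nodes = {a, b}

assert a.__dict__ == b.__dict__     # the wrapped dicts are equal by value
assert a != b                       # but the wrappers compare by identity
assert len(nodes) == 2              # so the set keeps both: an identity set
assert a in nodes
assert Node() not in nodes          # a fresh instance is never "already there"
```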

You cannot always choose the node type. Sometimes the nodes already exist and you just have to work with them.
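A small sketch of that situation, assuming the pre-existing nodes happen to be lists (unhashable, and compared by value). The names n1, n2, visited and seen are invented for the example. The usual workaround is to key an ordinary dict by id() and keep the object itself in the value so that its id cannot be reused while the entry exists.

```python
# Nodes whose type we do not control: lists are unhashable and compare by value.
n1 = [1, 2]
n2 = [1, 2]                      # equal to n1, but a distinct object

# Identity-dict idiom: key by id(), and store the object in the value so it
# stays alive and its id() cannot be recycled while the entry is present.
visited = {}
visited[id(n1)] = (n1, "first")

def seen(node):
    """Return True if this exact object (not merely an equal one) was recorded."""
    return id(node) in visited

assert n1 == n2                  # equality cannot tell the two nodes apart
assert seen(n1)
assert not seen(n2)              # identity can
```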

>> so I don't see what would be gained.

> You are proposing (yet-another) dict variation for use in python code.

In fact, I am thinking first of all about C code. Currently, using the identity dict/set idiom in C code is rather cumbersome. With a standard IdentityDict it should be as simple as using an ordinary dict.
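For readers who want to see the intended behaviour, here is a rough Python sketch of what such a mapping might look like. It is only an illustration: no IdentityDict exists in the standard library, the class below is hypothetical, and it says nothing about how a C-level version would be implemented.

```python
from collections.abc import MutableMapping

class IdentityDict(MutableMapping):
    """Hypothetical sketch: a mapping keyed by object identity, not equality."""

    def __init__(self):
        # id(key) -> (key, value); holding the key keeps its id() valid.
        self._data = {}

    def __getitem__(self, key):
        try:
            return self._data[id(key)][1]
        except KeyError:
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        self._data[id(key)] = (key, value)

    def __delitem__(self, key):
        try:
            del self._data[id(key)]
        except KeyError:
            raise KeyError(key) from None

    def __iter__(self):
        return (key for key, _ in self._data.values())

    def __len__(self):
        return len(self._data)

# Usage: two equal, unhashable keys are kept as separate entries.
d = IdentityDict()
k1, k2 = [], []
d[k1] = "a"
d[k2] = "b"
assert len(d) == 2 and d[k1] == "a" and d[k2] == "b"
assert [] not in d
```

The point of the sketch is that the calling code reads exactly like code using an ordinary dict; the id() bookkeeping is hidden inside the container.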

> That requires more justification than a few percent speedup in specialized usages. It should make python programming substantially easier in multiple use cases. I do not yet see this in regard to graph algorithm.

The identity dict/set idiom is used in at least the following stdlib modules: _threading_local, xmlrpc.client, json, lib2to3 (xrange fixer), copy, unittest.mock, idlelib (rpc, remote debugger and browser), ctypes, doctest, pickle, cProfile. It may also be used implicitly in some other places, or could be used there.
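As one concrete case from that list, the copy module's memo dictionary can be observed from Python. This sketch assumes CPython's current implementation, in which copy.deepcopy records copies in the memo under id() of the original; the documentation asks callers to treat the memo as opaque, so the last assertion is for illustration only.

```python
import copy

shared = [1, 2]
graph = [shared, shared]              # the same list object referenced twice

memo = {}
clone = copy.deepcopy(graph, memo)

assert clone[0] is clone[1]           # sharing is preserved thanks to the memo
assert clone[0] is not shared         # yet the contents were really copied
# CPython implementation detail: the memo is keyed by id() of the original.
assert memo[id(shared)] is clone[0]
```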


