[Python-Dev] Efficient set complement and operation on large/infinite sets. (original) (raw)

Raymond Hettinger rhettinger at ewtllc.com
Fri May 12 02:34:16 CEST 2006


Guido> Hm... Without reading though all this, I expect that you'd be Guido> better off implementing this for yourself without attempting to pull Guido> the standard library sets into the picture (especially since sets.py Guido> is obsolete as of 2.4; set and frozenset are now built-in types). Guido> You're really after rather specialized set representations. I've Guido> done this myself and as long as you stick to solving just the Guido> problem at hand, it's easy. If you want a general solution, it's Guido> hard...

I certainly would be better off in the short term, and probably the long term too. It's likely what I'll do in any case as it's much, much quicker, I only need a handful of the set operations, and I don't need to talk to anyone :-) I don't think I'm proposing something specialized. Set complement is something one learns in primary school. It's just difficult to provide in general, as you say. Aside from my own itch, which I know how to scratch, my question is whether it's worth trying to work this into Python itself. Sounds like not. There's room in the world for alternate implementations of sets, each with its own strengths and weaknesses. For example, bitsets potentially offer some speed and space economies for certain apps. Alternatve implementations will most likely start-off as third-party extension modules, and if they prove essential, they may make it to the collections module.

As for the built-in set types, I recommend leaving those alone and keeping a API as simple as possible.

The rand idea is interesting but I don't see how to implement an equiprobable hash table selection in O(1) time -- keep in mind that a set may be very sparse at the time of the selection.

Raymond



More information about the Python-Dev mailing list