[Python-Dev] Currently baking idea for dict.sequpdate(iterable, value=True)

Guido van Rossum guido@python.org
Mon, 25 Nov 2002 20:12:54 -0500


If I understood correctly, the not-so-veiled consideration is that sets are slower and always will be.

"The Sets module met several needs centering around set mathematics; however, for membership testing, it is so slow that it is almost always preferable to use dictionaries instead (even without this proposed method). The slowness is intrinsic because of the time to search for the __contains__ method in the class and the time to set up a try/except to handle mutable elements. Another reason to prefer dictionaries is that there is one less thing to import and expect readers to understand. My experiences applying the Sets module indicate that it will never replace dictionaries for membership testing and will have only infrequent use for uniquification." So the purist solution would be to work long-term on improving set speed.
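
For illustration, the dictionary idiom the quoted passage refers to might look roughly like the sketch below. The name sequpdate mirrors the method proposed in the subject line, but it is written here as a free function because the method is only a proposal; the sample data is made up.

```python
def sequpdate(d, iterable, value=True):
    """Hypothetical stand-in for the proposed dict.sequpdate method."""
    for key in iterable:
        d[key] = value
    return d

# Made-up sample data with one duplicate.
words = ["spam", "eggs", "spam", "ham"]
seen = sequpdate({}, words)

# Membership testing is a direct lookup on the dict's keys.
assert "spam" in seen
assert "bacon" not in seen

# Uniquification: the keys are the distinct elements.
unique = sorted(seen)
```

The dict carries a dummy value for every key, which is the price of reusing it as a set.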

There seems to be a misunderstanding about the status of the sets module. It is an attempt to prototype the set API without adding new C code. Once sets are accepted as a useful datatype, and we've settled upon the API, they should be reimplemented in C.
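
To make the prototype point concrete, here is a rough sketch of a set API layered on a dict in pure Python. The class name PrototypeSet and its methods are illustrative assumptions, not the sets module's actual code, and the handling of mutable elements is left out. It only shows where the extra Python-level method call comes from compared with a direct dict lookup.

```python
class PrototypeSet:
    """Illustrative pure-Python set built on a dict (not the real sets module)."""

    def __init__(self, iterable=()):
        self._data = {}
        for element in iterable:
            self._data[element] = True

    def __contains__(self, element):
        # Every "x in s" goes through this Python-level method
        # before reaching the underlying dict lookup.
        return element in self._data

    def __len__(self):
        return len(self._data)

    def union(self, other):
        # Copy our own elements, then add the other set's elements.
        result = PrototypeSet(self._data)
        for element in other._data:
            result._data[element] = True
        return result

s = PrototypeSet("abc")
t = PrototypeSet("cd")
u = s.union(t)
```

A C reimplementation could keep the same API while doing the lookup without the intermediate Python method call.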

Perhaps the current set implementation could be made faster by limiting it somewhat more? The current API attempts to be fast and flexible, but tends to favor correctness over speed where a trade-off has to be made. But maybe that's a poor way of selling a new built-in data type, and we would do better by having a truly fast implementation that is more limited? It's easier to remove limitations than to add them.

--Guido van Rossum (home page: http://www.python.org/~guido/)