[Python-Dev] collections module (original) (raw)

Raymond Hettinger python at rcn.com
Fri Jan 9 00:32:52 EST 2004


I would like to establish a new module for some collection classes.

The first type would be a bag, modeled after one of Smalltalk's key collection classes (similar to MultiSets in C++ and bags in Objective C).

The second type would be a high speed queue implemented using linked list blocks like we used for itertools.tee(). The interface could be as simple as push, pop, and len. Each operation ought to run about as fast a list assignment (meaning that it beats the heck out of the list.pop(0), list.append(data) version).

This module could also serve as the "one obvious place to put it" for other high performance datatypes that might be added in the future (such as fibheaps or some such).

Guido wanted this to be discussed by a number of people. While a rich set of collection classes have proven their worth in other contexts, it could be that more is not better for python. One the language's elegances is the ability to do just about anything with just lists and dicts. If you have too many mutable containers to choose from, the choice of which to use becomes less obvious.

My thoughts are that the existence of the array module, for example, has never impaired anyone's ability to learn or use lists and dicts.

Also, no one reaches for a bag unless they are counting things.

Likewise, queues are in a mental concept space distinct from dicts, sets, and such. My only worry with queues is differentiating them from Queue objects which have builtin synchronization for use in threads.

Besides, having bag and queue in a separate module means that they needn't enter anyone's consciousness until a need arises.

What do you guys think?

Raymond Hettinger

from collections import bag b = bag('banana') b.sortedbycount() [('a', 3), ('n', 2), ('b', 1)] list(b) ['a', 'a', 'a', 'b', 'n', 'n'] b.asSet() set(['a', 'b', 'n'])



More information about the Python-Dev mailing list