[Python-Dev] collections module

Josiah Carlson jcarlson at uci.edu
Sat Jan 10 14:06:41 EST 2004


If the whole problem is the performance of pop(0) when queues grow large, why not implement a queue as a list of lists (sublists), where all the magic is selecting an appropriate maximum size for the sublists? Wouldn't this effectively simulate the "extra space at the left side of the list" algorithm without touching the C list implementation?

*pulls out algorithm analysis hat* If your maximum sublist size is k, then any pop costs O(k). If you have a total of n items in the queue, then every k items incurs an O(n/k) penalty for the outer-list shift, or O(n/k^2) per item, for an overall cost of O(k + n/k^2) per pop with a total queue size of n and a maximum sublist size of k.

For k > n^(1/3), this is O(k) per pop operation.
For k < n^(1/3), this is O(n/k^2) per pop operation, which is always greater than k.

As k approaches 1, pops are horrible, but as long as k is greater than the cube root of n, the operations are reasonable. If one knows the maximal size of the queue beforehand, setting up a properly sized k would result in reasonable behavior using lists of lists.
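The list-of-lists idea above can be sketched as follows; the class and parameter names (ChunkedFIFO, k) are illustrative, not from the original discussion:

```python
# Sketch of a FIFO built from fixed-size sublists: pop(0) on a sublist
# shifts at most k elements, and the outer list is shifted only once
# per k gets, giving the O(k + n/k^2) amortized cost described above.

class ChunkedFIFO:
    def __init__(self, k=64):
        self.k = k          # maximum sublist size
        self.chunks = []    # outer list of sublists

    def put(self, item):
        # Start a new sublist once the last one is full.
        if not self.chunks or len(self.chunks[-1]) >= self.k:
            self.chunks.append([])
        self.chunks[-1].append(item)

    def get(self):
        # O(k) pop from the head sublist; when it empties, an
        # O(n/k) shift of the outer list removes it.
        item = self.chunks[0].pop(0)
        if not self.chunks[0]:
            del self.chunks[0]
        return item
```

For example, `ChunkedFIFO(k=4)` holding ten items keeps them in three sublists and still returns them in insertion order.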

However, Knuth's method, a dictionary FIFO, and a [val, [val, []]] linked list all give O(1) puts and gets on either end (the [val, [val, []]] variant would require a previous pointer).
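One way to realize the "dictionary FIFO" mentioned above is two integer counters indexing into a dict, which makes every operation O(1) at both ends; this is a minimal sketch with illustrative names, not code from the thread:

```python
# A dict keyed by consecutive integers acts as a double-ended FIFO:
# head is the next index to get from the front, tail the next index
# to put at the back. Every put/get is a single O(1) dict operation.

class DictFIFO:
    def __init__(self):
        self.data = {}
        self.head = 0   # next index to remove from the front
        self.tail = 0   # next index to insert at the back

    def put(self, item):            # O(1) append at the right end
        self.data[self.tail] = item
        self.tail += 1

    def putleft(self, item):        # O(1) append at the left end
        self.head -= 1
        self.data[self.head] = item

    def get(self):                  # O(1) pop from the left end
        item = self.data.pop(self.head)
        self.head += 1
        return item

    def getright(self):             # O(1) pop from the right end
        self.tail -= 1
        return self.data.pop(self.tail)
```

Because deletion uses `dict.pop`, the structure never retains already-consumed items, so memory tracks the live queue size.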

Another idea, for a single-list solution where the workload is mostly long runs of puts followed by long runs of gets, perhaps this would be fast:

An attribute of the instance stores the last operation (put or get). If the current operation differs from the last, first reverse the underlying list, then do the append() or pop(). But that is not practical for very large queues.

It would be perfectly reasonable for streamed puts/gets, but I don't know how common such access patterns are.
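The reverse-on-direction-change idea can be sketched like this (names are illustrative); a run of m puts followed by m gets pays for only one O(m) reversal, but alternating puts and gets reverse the whole list every time, which is why it breaks down for very large queues:

```python
# Single-list queue that tracks the last operation and reverses the
# list only when the direction switches, so puts and gets both work
# on the cheap end of the list (append/pop at the tail).

class FlipQueue:
    def __init__(self):
        self.items = []
        self.last = 'put'   # current orientation of the list

    def put(self, item):
        if self.last == 'get':
            self.items.reverse()   # restore insertion order
            self.last = 'put'
        self.items.append(item)    # O(1) append at the tail

    def get(self):
        if self.last == 'put':
            self.items.reverse()   # oldest item moves to the tail
            self.last = 'get'
        return self.items.pop()    # O(1) pop from the tail
```

With streamed usage (all puts, then all gets) each element is touched by at most one reversal, so the amortized cost per operation stays O(1).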


