[Python-Dev] collections module (original) (raw)
Tim Peters tim.one at comcast.net
Sat Jan 10 15:49:26 EST 2004
- Previous message: [Python-Dev] collections module
- Next message: [Python-Dev] collections module
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
[Kurt B. Kaiser]
Aren't array-based implementations of queue usually based on circular buffers?
Yes, but those are generally bounded queues.
Head and tail pointers plus a count are sufficient. Then a push is: check count (resize if necessary), write, adjust head pointer, adjust count.
By analogy to a real-life queue (of, say, people), pushing is usually viewed as adding to the tail (when you get in line for movie tickets, you're likely to get punched if you add yourself to the head of the queue). I have to insist on using that terminology, just because it ties my head in knots to swap the usual meanings.
A pull is similar, using the tail pointer. Constant time.
So is the speculative Python list gimmick we're talking about, until it runs out of room. We use mildly exponential overallocation when growing a list to ensure amortized O(1) behavior regardless.
The resize is a little tricky because the gap between the tail and head pointers is what needs to grow.
Resizing is easier with the speculative Python list gimmick, and 0-based indexing (where list[0] is the current head) is identical to the list-indexing implementation Python already has. The latter is important because it's fast. Needing to "wrap around" an index to account for circularity would introduce new test+branch expense into indexing.
The current implementation of Queue (with what appears to me to be a listpop(v, 0) for get() ) involves an internal shift of the whole pointer vector, O(n), for every read of the queue. The Cookbook examples mentioned alleviate this but still with excess copying or unbounded memory usage. IMHO it's better done in listobject.
Queue.py and the speculative Python list gimmick are different issues. The overhead of mucking with locks, and Python-level method calls, in Queue.py surely overwhelms whatever savings you could get over the two-list queue implementation Josiah put in his Cookbook comment. The only "extra" expense there over a straight push-and-pop-both-at-the-right-end Python list is that each element added to the queue is involved in (no more than) one C-level pointer swap (due to the reverse() call, when the input list is turned into the output list). If the speculative Python list gimmick were implemented, Queue.py could, of course, change to use it.
I'm not up to speed on the reasons why obitem needs to float
So that indexing a Python list is as fast as it is today, and so that a mountain of C code continues to work (all C code mucking with Python lists now believes that a list contains ob_size elements, starting at ob_items and ending at ob_items+ob_size-1).
but perhaps there is a straightforward indirect relationship to the head pointer.
ob_item would be the head pointer (for my meaning for "head"), whenever the list is non-empty.
Why not have listobject grow deque functionality using circular buffer techniques?
As above, I don't see much to recommend that over the speculative Python list gimmick: indexing is slower and more complicated, resizing is more complicated, and all the listobject.c internals would get more complicated by needing to deal with that the list contents are no longer necessarily in a contiguous slice of a C vector (from the C POV, the list contents could be in two disjoint contiguous slices). It would have the advantage that resizing is never needed until you're wholly out of room, but that seems minor compared to the drawbacks.
- Previous message: [Python-Dev] collections module
- Next message: [Python-Dev] collections module
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]