[Python-Dev] FIFO data structure? (original) (raw)
Agthorr agthorr@barsoom.org
Mon, 21 Apr 2003 22:42:18 -0700
- Previous message: [Python-Dev] FIFO data structure?
- Next message: [Python-Dev] FIFO data structure?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Sun, Apr 20, 2003 at 09:31:04PM -0400, Tim Peters wrote:
I'm opposed to this. The purpose of Queue is to mediate communication among threads, and a Queue.Queue rarely gets large because of its intended applications. As other recent timing posts have shown, you simply can't beat the list.append + list.pop(0) approach until a queue gets quite large (relative to the intended purpose of a Queue.Queue).
Out of curiosity, I ran some tests, comparing: list.append + list.pop(0) Queue.Queue my modified Queue.Queue
The test adds n integers to the Queue, then removes them. I use the timeit module to perform the measurements, and do not count the loading of the module or creating of the list/queue object (since presumably the user will do this extremely infrequently).
What I found was that for small n, list.append/pop is much faster than either Queue implementation. I assume this means that the bulk of the time is spent dealing with thread synchronization issues and with the overhead of using a class. It takes around one twentieth the time to complete the list.append/pop compared to either Queue implementation.
For small n, the two Queue implementations were at least in the same ballpark. Mine was roughly 25% slower for n < 10, and around 10% slower for 10 < n < 100. After that, the difference gradually declined until the circular array took the lead somewhere in the vicinity of n=2000. The performance difference didn't become large until n=10000 where the O(n^2) growth finally began to kill the list.append/pop.
Disappointed with these results, I spent some time tweaking my modified Queue.Queue to improve the performance. I create local variables in a few places, perform a bitwise-AND instead of a modulus, and initialize the circular buffer with 8 elements instead of just 1. I also now grow the circular buffer more efficiently.
This made a huge difference. My implementation now outperforms the current Queue.Queue for n > 1! It does around 1% to 4% better up until around n=500, then the advantage starts to slowly ramp up.
My updated Queue implement is here: http://www.cs.uoregon.edu/~agthorr/QueueNew.py
and my test program is here: http://www.cs.uoregon.edu/~agthorr/test.py
If you have an unusual application for a Queue.Queue where it's actually faster to do a circular-buffer gimmick (and don't believe that you do before you time it),
My application is a little program that sends simulation jobs to a small server farm. I have one thread per server that grabs jobs off the Queue and starts the remote simulation. I have a fair number of simulation parameters, and this translates into thousands of jobs getting added to the Queue. So, yes, for my particular application, the O(n^2) behavior really is a genuine problem ;)
If circular-array Queue.Queue was significantly slower for low n, I'd agree with you that the current implementation should not be changed. It doesn't appear to be a problem, though.
However, speaking of subclassing Queue: is it likely there are many user applications that subclass it in a way that would break? (i.e., they override some, but not all, of the functions intended for overriding).
-- Agthorr
- Previous message: [Python-Dev] FIFO data structure?
- Next message: [Python-Dev] FIFO data structure?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]