[Python-Dev] collections module

Tim Peters tim.one at comcast.net
Sat Jan 10 00:06:14 EST 2004


[Tim]

Maybe; it probably doesn't matter; serious threaded apps pass maxsize to Queue, generally equal to the # of worker threads (or a small constant multiple thereof), and the bounded queue does double-duty then as a flow-control throttle too; list.pop(0) isn't really slow when len(list) is that small.

[Josiah Carlson]

It really all depends on how fast the producers and consumers are.

Mediating differences in speed is what "flow-control throttle" means. A "serious threaded app" can't allow speed differences to lead to unbounded resource consumption, and a bounded queue is a wonderful solution to that.
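
For instance, here's a minimal sketch of the throttling pattern (the worker function, the bound of 4, and the sleep are illustrative, not from any real app):

    import Queue, threading, time

    q = Queue.Queue(maxsize=4)      # bound ~ small multiple of the worker count

    def worker():
        while True:
            item = q.get()          # blocks until work arrives
            time.sleep(0.1)         # stand-in for real (slow) work

    for i in range(2):
        t = threading.Thread(target=worker)
        t.setDaemon(True)
        t.start()

    for item in range(100):
        q.put(item)                 # blocks once 4 items are pending, so the
                                    # producer runs at the consumers' pace

The put() calls can never get more than maxsize items ahead of the consumers, so resource use stays bounded no matter how fast the producer is.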

In the cases where I've used Queue.Queue, I didn't have small Queues.

That doesn't demonstrate you had a sane design.

and a single conditional,

Sorry, I don't know what that might mean.

oops, I misspelled it, threading.Condition()

Ah, got it.

... The below is quite fast for most reasonable numbers of threads - it does wake all threads up that were waiting, but I find it considerably clearer, and can be done with a single synchronization primitive.

I think you mean a single synchronization instance variable. As I noted last time, Queue was written before Condition variables -- or even the threading module -- existed. (Note that it still imports the ancient low-level 'thread' module instead.) There's no question that condvars are clearer than raw locks for anything fancier than a plain critical section, and I'd be in favor of rewriting Queue.py to use condvars, provided nothing important is lost.

    from threading import Condition   # assuming fifo exists and is fast

    class CQueue:
        def __init__(self, maxsize):
            if maxsize > 0:
                self.m = maxsize
            else:
                # big enough, but Max would be better
                self.m = 2**64

The way the code uses it, -1 would work fine -- even on 128-bit machines.

            self.q = fifo()
            self.l = Condition()

        def qsize(self):
            self.l.acquire()
            l = len(self.q)
            self.l.release()
            return l

        def empty(self):
            return self.qsize() == 0

        def full(self):
            return self.qsize() == self.m

        def put(self, item, block=1):
            try:
                self.l.acquire()

That line should go before the "try:" (if the acquire() attempt raises an exception for some reason, the "finally:" block will try to release an unacquired condvar, raising a new exception, and so masking the real problem).
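
That is, the pairing should look like this (a sketch of just the ordering, not the whole method):

        def put(self, item, block=1):
            self.l.acquire()        # acquire *before* entering the try block
            try:
                # ... wait/insert/notify logic goes here ...
                pass
            finally:
                self.l.release()    # runs only if the acquire succeeded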

                while block and len(self.q) == self.m:
                    self.l.wait()
                    self.l.release()
                    self.l.acquire()

Note that the last two lines there aren't necessary. Or, if they are, this is much more obscure than Queue.py. wait() releases the associated lock as part of waiting, and reacquires it as part of waking up.
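
So the loop can shrink to just (same names as the code above):

                while block and len(self.q) == self.m:
                    self.l.wait()   # releases self.l while blocked, and
                                    # reacquires it before returning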

                if len(self.q) < self.m:
                    self.q.put(item)
                    self.l.notifyAll()

It's not clear why this isn't plain notify(). Note that if self.q.put() raises an exception (for example, MemoryError if the box is plain out of memory), no notify is done, and so other threads blocked in put() and/or get() that could make progress aren't woken up. Queue.py endures excruciating pain to try to prevent endcase glitches like that.
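
One way to get the wakeup even when the insert blows up -- a sketch, not what Queue.py actually does -- is to notify in a finally block; the 'while' retest in the waiters makes a spurious wakeup harmless:

                if len(self.q) < self.m:
                    try:
                        self.q.put(item)
                    finally:
                        self.l.notify()   # wake one waiter even if put() raised;
                                          # waiters retest their condition, so a
                                          # spurious wakeup does no harm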

                else:
                    raise Full
            finally:
                self.l.release()

... [similar stuff for get()] ...

Part of the simplification here is due to the use of a condvar. A great deal more is due to the fact that it simply ignores some of the bells and whistles Queue.py has sprouted: a defined protocol for subclassing, so that "the real queue" can have any concrete representation and semantics (e.g., it's easy to make a stack-like Queue subclass now, or a priority-queue-like Queue subclass) without the subclasses needing to know anything about the thread convolutions; optional timeouts on get and put; and anal attention paid to the consequences of "unexpected" exceptions.
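
For reference, that subclassing protocol amounts to overriding a handful of hook methods while Queue.Queue keeps all the locking. A stack-like subclass, assuming the _init/_put/_get hook names Queue.py uses, is just:

    import Queue

    class LifoQueue(Queue.Queue):
        # Override only the "real queue" hooks; Queue.Queue supplies
        # all the locking and blocking behavior.
        def _init(self, maxsize):
            self.queue = []
        def _put(self, item):
            self.queue.append(item)
        def _get(self):
            return self.queue.pop()   # pop from the end: LIFO, not FIFO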

A possible advantage of the condvar logic is that it doesn't raise Full unless the queue is actually full, or Empty unless the queue is actually empty. The current code can raise those now for another reason that users have a hard time understanding: self.mutex just doesn't happen to be instantly available. OTOH, that's also a feature, taking "don't block" very seriously, not even waiting for a queue operation that happens to take a long time. For example, one thread gets into self.q.put(item) and loses its timeslice. Some other thread tries to do put(item, block=False). Under Queue.py today, the latter thread raises Full at once, because it can't instantly acquire the mutex. Under the condvar scheme, it does block, and may block for a long time (the former thread has to get enough timeslices to finish its self.q.put(item) and get through its self.l.release(), and the latter thread may not acquire the condvar even then -- any number of other threads may get it first). The tradeoffs here are murky, but the condvar scheme would be a change in actual behavior in such cases, so they can't be ignored.
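
In a sketch (illustrative of the idea, not Queue.py's actual code), the two approaches to a non-blocking put differ like so:

    # mutex scheme: "don't block" is taken literally
    if not self.mutex.acquire(0):   # give up at once if the lock is busy
        raise Full

    # condvar scheme: the lock itself is always waited for
    self.l.acquire()                # may block while another thread is
                                    # still inside self.q.put(item)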

... I agree completely, most locking mechanisms are slow as hell. There are some tricks that can be done with critical sections (or their equivalents on other platforms); an article about multithreaded locking performance was published by IBM at least a year ago. I wonder if any of those performance-tuning techniques can be used with the underlying locks in Python.

[and later]

One thing that bothered me about using the three mutexes as given in Queue.Queue is that it relies on the fact that a thread can unlock an underlying thread.lock that it didn't acquire. To me, this is strange behavior.

All those are related. If you port Python to a new thread implementation, there's only one synchronization object you have to implement in C: thread.lock. While it may seem strange to you, it's a "universal" primitive, meaning that all other synch primitives can be built on top of it (and, in Python, they are). When Guido was designing Python, "the Python lock" was the same synch primitive used in the OS he was working on at the time, and it didn't seem strange to me either because it was also the only synch primitive we built into hardware at Kendall Square Research at the time (on that machine, Python's lock.acquire() and lock.release() were each one machine instruction!). So I didn't gripe either at the time, especially seeing as KSR machines were clearly destined to take over the world.
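
To make "universal" concrete, here's a toy sketch of a condition variable built from nothing but thread.lock -- the same basic trick threading.Condition uses internally (error handling omitted):

    import thread

    class SimpleCondition:
        def __init__(self):
            self.lock = thread.allocate_lock()
            self.waiters = []

        def acquire(self):
            self.lock.acquire()

        def release(self):
            self.lock.release()

        def wait(self):
            # Must be called with self.lock held.
            waiter = thread.allocate_lock()
            waiter.acquire()              # take the private lock once ...
            self.waiters.append(waiter)
            self.lock.release()
            waiter.acquire()              # ... then block on it until notify()
            self.lock.acquire()

        def notify(self):
            # Must be called with self.lock held.  Note that this releases a
            # lock some *other* thread acquired -- exactly the "strange"
            # property the universality of thread.lock depends on.
            if self.waiters:
                self.waiters.pop(0).release()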

Note that the various synch objects supplied by threading.py (like RLocks and Conditions) are supplied via factory functions, not directly by Python classes. This was deliberate, so that someone motivated enough could replace them by faster platform-specific implementations, and not have to worry about breaking user-defined subclasses. For example, RLocks are supplied natively by Windows, so a much faster RLock gimmick could be implemented specific to Windows.
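
The factory amounts to no more than this sketch (the native-check names here are hypothetical):

    def RLock(*args, **kwargs):
        if _have_native_rlock:              # hypothetical platform test
            return _NativeRLock(*args, **kwargs)
        return _RLock(*args, **kwargs)      # the portable pure-Python class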

Oh, by adding a second Condition(), you don't need to notify everyone and can replace the 'while' uses with 'if'. Even faster.

You can't replace 'while' with 'if' then (think about it, and note that the docs explicitly warn that .notify() may wake more than one thread). And then you're well on the way to reinventing esema and fsema, with the current self.mutex being the lock shared by your two condvars. They'd be clearer as condvars, though, and it's always a good idea to name a "condition variable" after the condition it's modelling (like not_full or not_empty).
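
Spelled out, that design looks something like this (a sketch, with the condvars named after the conditions they model):

    from threading import Lock, Condition

    class CQueue2:
        def __init__(self, maxsize):
            self.maxsize = maxsize
            self.q = []
            mutex = Lock()                     # one lock shared by both condvars
            self.not_full = Condition(mutex)
            self.not_empty = Condition(mutex)

        def put(self, item):
            self.not_full.acquire()
            try:
                while len(self.q) >= self.maxsize:  # 'while', not 'if': notify()
                    self.not_full.wait()            # may wake more than one thread
                self.q.append(item)
                self.not_empty.notify()             # plain notify: wake one getter
            finally:
                self.not_full.release()

        def get(self):
            self.not_empty.acquire()
            try:
                while not self.q:
                    self.not_empty.wait()
                item = self.q.pop(0)
                self.not_full.notify()              # wake one putter
                return item
            finally:
                self.not_empty.release()

Because both condvars share one mutex, a putter's notify of not_empty happens while holding the same lock a getter's wait released, so the handoff stays race-free.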


