[Python-ideas] reducing multiprocessing.Queue contention

Charles-François Natali cf.natali at gmail.com
Wed Jan 23 12:16:14 CET 2013


Hello,

Currently, the multiprocessing.Queue put() and get() methods hold locks for the entire duration of the write to, or read from, the backing Connection (which can be a pipe, a Unix domain socket, or a named pipe on Windows).

For example, here's what the feeder thread does:
"""
else:
    wacquire()
    try:
        send(obj)
        # Delete references to object. See issue16284
        del obj
    finally:
        wrelease()
"""

Connection.send() has to serialize the data using pickle before writing it to the underlying file descriptor, and Connection.recv() has to deserialize it after reading. While the locking is necessary to guarantee atomic read/write (well, it's not necessary if you're writing at most PIPE_BUF bytes to a pipe, and writes seem atomic on Windows), the locks don't have to be held while the data is serialized or deserialized.

Although I didn't take any measurements, my gut feeling is that serialization can take a non-negligible part of the overall sending/receiving time for large data items. If that's the case, then holding the lock only for the duration of the read()/write() syscall (and not during serialization) could reduce contention when sending or receiving large items.
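
One rough way to check that gut feeling is to time the two phases separately. The sketch below is only an illustration: the helper name time_pickle_vs_write is made up for this example, it uses an in-process pipe with a background thread draining it, and the numbers it produces will vary with the platform and the payload.

```python
import pickle
import threading
import time
from multiprocessing import Pipe


def time_pickle_vs_write(obj):
    """Return (seconds spent pickling, seconds spent writing) for obj.

    Hypothetical helper for illustration only; not part of multiprocessing.
    """
    reader, writer = Pipe(duplex=False)
    # Drain the pipe in the background so a large write cannot block forever.
    drainer = threading.Thread(target=reader.recv_bytes)
    drainer.start()

    t0 = time.perf_counter()
    buf = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    t1 = time.perf_counter()
    writer.send_bytes(buf)  # raw write, no pickling involved
    t2 = time.perf_counter()

    drainer.join()
    reader.close()
    writer.close()
    return t1 - t0, t2 - t1


if __name__ == "__main__":
    pickling, writing = time_pickle_vs_write(list(range(1000000)))
    print("pickle: %.4fs, write: %.4fs" % (pickling, writing))
```

If pickling turns out to be a large share of the total, that is the part that could be moved outside the lock.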

One way to do that would be to refactor the code a bit to provide a (private) AtomicConnection, which would encapsulate the necessary locking. Another advantage is that this would hide the platform-dependent code inside Connection (right now, Queue only uses a lock for sending on Unix platforms, since write is apparently atomic on Windows).
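
To make the idea concrete, here is a minimal sketch of what such a wrapper might look like. Everything in it is an assumption for the sake of illustration: the class layout, the constructor arguments, and the choice to always take a write lock (the real code would presumably skip it on Windows). It relies only on the existing Connection.send_bytes() and Connection.recv_bytes() methods, and pickles outside the critical section.

```python
import pickle
from multiprocessing import Lock, Pipe


class AtomicConnection:
    """Hypothetical wrapper: locks cover only the raw read/write."""

    def __init__(self, conn, rlock=None, wlock=None):
        self._conn = conn
        # Assumption: one lock per direction, shared by all users of the
        # same end of the connection.
        self._rlock = rlock if rlock is not None else Lock()
        self._wlock = wlock if wlock is not None else Lock()

    def send(self, obj):
        # Serialize first, without holding the lock.
        buf = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
        with self._wlock:
            self._conn.send_bytes(buf)

    def recv(self):
        with self._rlock:
            buf = self._conn.recv_bytes()
        # Deserialize after the lock has been released.
        return pickle.loads(buf)


def demo():
    """Round-trip a small object through a one-way pipe."""
    reader, writer = Pipe(duplex=False)
    tx, rx = AtomicConnection(writer), AtomicConnection(reader)
    tx.send({"answer": 42})
    result = rx.recv()
    reader.close()
    writer.close()
    return result
```

Queue could then call send() and recv() on such an object without touching the locks itself.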

Thoughts?


