Issue 10886: Unhelpful backtrace for multiprocessing.Queue

Created on 2011-01-11 11:16 by torsten, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name                              Uploaded               Description
mp_queue_pickle_in_main_thread.patch   sbt, 2011-08-29 16:10
Messages (8)
msg125996 - (view) Author: Torsten Landschoff (torsten) * Date: 2011-01-11 11:16
When trying to send an object via a Queue that can't be pickled, one gets a quite unhelpful traceback:

    Traceback (most recent call last):
      File "/usr/lib/python2.6/multiprocessing/queues.py", line 242, in _feed
        send(obj)
    PicklingError: Can't pickle <type 'module'>: attribute lookup __builtin__.module failed

I have no idea where I am sending this. It would be helpful to get the call trace to the call to Queue.put.

My workaround was to create a Queue via this function MyQueue:

    def MyQueue():
        import cPickle
        def myput(obj, *args, **kwargs):
            cPickle.dumps(obj)
            return orig_put(obj, *args, **kwargs)
        q = Queue()
        orig_put, q.put = q.put, myput
        return q

That way I get the pickle exception in the caller to put and was able to find out the offending code.
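For readers on Python 3, the same workaround can be sketched roughly as follows. This is not code from the issue: the name `with_eager_pickle_check` is hypothetical, and the sketch assumes that plain `pickle.dumps` is a close enough stand-in for the pickler that multiprocessing uses internally, which may not hold for objects with custom multiprocessing reducers.

```python
import pickle
import queue  # only needed for the thread-queue usage shown in the note below


def with_eager_pickle_check(q):
    """Wrap q.put so that an unpicklable object fails in the caller's frame.

    Hypothetical helper: q may be any object with a put() method that allows
    instance attribute assignment, such as a multiprocessing.Queue.
    """
    orig_put = q.put

    def checked_put(obj, *args, **kwargs):
        # Serialize once purely as a check and discard the result. With a
        # multiprocessing.Queue the object is therefore pickled twice.
        pickle.dumps(obj)
        return orig_put(obj, *args, **kwargs)

    q.put = checked_put
    return q
```

Calling `with_eager_pickle_check(multiprocessing.Queue())` should then raise from the `put()` call site, so the traceback points at the offending code. The cost is the extra serialization on every `put()`.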
msg143154 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2011-08-29 16:10
mp_queue_pickle_in_main_thread.patch (against the default branch) fixes the problem by doing the pickling in Queue.put(). It is a version of a patch for Issue 8037 (although I believe the behaviour complained about in Issue 8037 is not an actual bug). The patch also has the advantage of ensuring that weakref callbacks and __del__ methods for objects put in the queue will not be run in the background thread. (Bytes objects have trivial destructors.) This potentially prevents inconsistent state caused by forking a process while the background thread is running -- see Issue 6721.
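The idea of pickling in put() can be illustrated with a small wrapper. This is only a sketch of the approach, not the contents of the attached patch: the class name `BytesOnlyQueue` and its `transport` parameter are hypothetical, and the patch itself changes multiprocessing's own Queue rather than wrapping it.

```python
import pickle
import queue  # only needed for the thread-queue usage shown in the note below


class BytesOnlyQueue:
    """Hypothetical wrapper that pickles in put() and unpickles in get()."""

    def __init__(self, transport):
        # transport is any object with put() and get(), for example a
        # multiprocessing.Queue or, for experiments, a queue.Queue.
        self._transport = transport

    def put(self, obj, *args, **kwargs):
        # Pickling happens in the caller's thread, so a failure is raised
        # here with the caller's traceback. Only bytes reach the transport,
        # so its background thread never holds a reference to obj.
        payload = pickle.dumps(obj)
        self._transport.put(payload, *args, **kwargs)

    def get(self, *args, **kwargs):
        return pickle.loads(self._transport.get(*args, **kwargs))
```

With a `multiprocessing.Queue` as the transport, the bytes payload is pickled a second time by the queue itself, so this wrapper shows the error-reporting and destructor behaviour described above rather than the patch's performance characteristics.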
msg143161 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-29 16:45
This shouldn't be a problem in Python 3.3, where the Connection classes are reimplemented in pure Python.
msg143177 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2011-08-29 20:10
> This shouldn't be a problem in Python 3.3, where the Connection classes
> are reimplemented in pure Python.

What should not be a problem? Changes to the implementation of Connection won't affect whether Queue.put() raises an error immediately if it gets an unpicklable argument. Nor will they affect whether weakref callbacks or __del__ methods run in a background thread, causing fork-safety issues.
msg143179 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-29 20:52
> Changes to the implementation of Connection won't affect whether
> Queue.put() raises an error immediately if it gets an unpicklable
> argument.

Ah, right. Then indeed it won't make a difference.
msg182734 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-02-23 11:03
I'm closing, since issue #17025 proposes to do this as part of performance optimization.
msg183537 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-03-05 17:50
For the record, I'm posting these benchmark numbers here (originally from issue #17025):

"""
with patch:
$ ./python /tmp/multi_queue.py
took 0.7945001125335693 seconds with 1 workers
took 0.7428359985351562 seconds with 2 workers
took 0.7897098064422607 seconds with 3 workers
took 1.1860828399658203 seconds with 4 workers

I tried Richard's suggestion of serializing the data inside put(), but this reduces performance quite notably:
$ ./python /tmp/multi_queue.py
took 1.412883996963501 seconds with 1 workers
took 1.3212130069732666 seconds with 2 workers
took 1.2271699905395508 seconds with 3 workers
took 1.4817359447479248 seconds with 4 workers

Although I didn't analyse it further, I guess one reason could be that if the serializing is done in put(), the feeder thread has nothing to do but keep waiting for data to be available from the buffer, send it, and block until there's more to do: basically, it almost doesn't use its time-slice, and spends its time blocking and doing context switches.
"""

So serializing the data from put() seems to have a significant performance impact (other benchmarks are welcome); that's something to keep in mind.
msg241835 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2015-04-23 00:05
neologix: did you intend to re-open this ticket when you made your 2013-03-05 comment? It seems to me that you didn't intend to -- your comment doesn't say 're-opening because '. I'll close it again; if you want, please re-open it and just explain why.
History
Date User Action Args
2022-04-11 14:57:11 admin set github: 55095
2015-04-23 00:05:04 akuchling set status: open -> closed; nosy: + akuchling; messages: +; resolution: wont fix; stage: resolved
2013-03-05 17:50:39 neologix set messages: +
2013-03-05 13:31:23 neologix set status: closed -> open; superseder: reduce multiprocessing.Queue contention ->
2013-02-23 11:03:11 neologix set status: open -> closed; nosy: + neologix; messages: +; superseder: reduce multiprocessing.Queue contention
2011-10-06 20:20:39 neologix link issue8037 superseder
2011-08-29 20:52:21 pitrou set messages: +
2011-08-29 20:10:04 sbt set messages: +
2011-08-29 16:45:16 pitrou set nosy: + pitrou; messages: +
2011-08-29 16:10:54 sbt set files: + mp_queue_pickle_in_main_thread.patch; versions: + Python 3.1, Python 2.7, Python 3.2, Python 3.3, Python 3.4; nosy: + sbt; messages: +; keywords: + patch
2011-01-11 11:16:52 torsten create