Issue 4106: multiprocessing occasionally spits out exception during shutdown


Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: jnoller Nosy List: Thorney, Yaniv.Aknin, bobbyi, dpranke, gdb, jnoller, mattheww, pitrou, python-dev, tobami, vinay.sajip
Priority: normal Keywords: patch

Created on 2008-10-11 03:30 by skip.montanaro, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
test_proc.py skip.montanaro,2008-10-11 03:30
test_mult.py Thorney,2011-01-18 23:23 Example that spits out exception
mpqshutdown.patch pitrou,2011-08-24 20:02
Messages (22)
msg74656 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2008-10-11 03:30
I worked up a simple example of using the external processing module (0.52) for a friend at work today. I noticed some cases where it raised exceptions during exit. Not all the time, but not infrequently either. This evening I tweaked it for the 2.6 multiprocessing module's API and tried it out. I ran it in a large loop:

    for i in (range 500) ; do echo '!'$i ; python test_proc.py ; end | egrep '!'

Most of the time all I see are the '!' lines from the echo command. Every once in a while I see a traceback though. For example:

    Exception in thread QueueFeederThread (most likely raised during interpreter shutdown):
    Traceback (most recent call last):
      File "/Users/skip/local/lib/python2.7/threading.py", line 522, in __bootstrap_inner
      File "/Users/skip/local/lib/python2.7/threading.py", line 477, in run
      File "/Users/skip/local/lib/python2.7/multiprocessing/queues.py", line 233, in _feed
    <type 'exceptions.TypeError'>: 'NoneType' object is not callable

This occurred once in approximately 1500 runs of the script (three times through the above shell loop). The script used to trigger this exception is attached.
msg74657 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2008-10-11 03:31
Oh, the range command used in the shell for loop is analogous to Python's range() builtin function.
msg74658 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2008-10-11 12:28
Got another one just now, but with just the note about the exception in the queue feeder thread. The traceback was swallowed.
msg74659 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2008-10-11 12:36
Final comment before I see some feedback from the experts. I have this code in the worker function's loop:

    # quick pause to allow other stuff to happen a bit randomly
    t = 0.1 * random.random()
    time.sleep(t)

If I eliminate the sleep altogether pretty much all hell breaks loose. As I reduce the sleep time it gets noisier and noisier. I switched to a fixed sleep time and reduced it as far as

    time.sleep(0.00015625)

At that point it was complaining about killing worker processes on many of the runs, maybe 1 out of every 5 or 10 runs. I suppose the moral of the story is not to use multiprocessing except when you have long-running tasks.
msg80415 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2009-01-23 14:47
Skip, using this: while ((x++ < 500)) ; do echo '!'$i ; ./python.exe test_proc.py; done | egrep '!' I don't see the exception in python-trunk, freshly compiled. It could be an OS thing (I'm on OS/X) - I just want to confirm that you're still seeing this on trunk
msg80417 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2009-01-23 15:26
Ah ha. I see it if I run it with the loop set to 3000 - it is pretty rare.
msg109575 - (view) Author: Greg Brockman (gdb) Date: 2010-07-08 19:28
For what it's worth, I think I have a simpler reproducer of this issue. Using freshly-compiled python-from-trunk (as well as multiprocessing-from-trunk), I get tracebacks from the following about 30% of the time:

    import multiprocessing, time

    def foo(x):
        time.sleep(3)

    multiprocessing.Pool(1).apply(foo, [1])

My tracebacks are of the form:

    Exception in thread Thread-1 (most likely raised during interpreter shutdown):
    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/threading.py", line 530, in __bootstrap_inner
      File "/usr/local/lib/python2.7/threading.py", line 483, in run
      File "/usr/local/lib/python2.7/multiprocessing/pool.py", line 272, in _handle_workers
    <type 'exceptions.TypeError'>: 'NoneType' object is not callable
msg109579 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-07-08 19:40
Greg - what platform?
msg109580 - (view) Author: Greg Brockman (gdb) Date: 2010-07-08 19:43
I'm on Ubuntu 10.04, 64 bit.
msg109584 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-07-08 19:50
Greg - this is actually a different exception than the one in the original bug report; could you please file a new issue with the information you've provided? I'm going to need to find a 64-bit Ubuntu box as I don't have one right now.
msg109589 - (view) Author: Greg Brockman (gdb) Date: 2010-07-08 20:08
Sure thing. See http://bugs.python.org/issue9207.
msg126502 - (view) Author: Brian Thorne (Thorney) Date: 2011-01-18 23:23
With the example script attached I see the exception every time, on Ubuntu 10.10 with Python 2.6.

Since the offending line in multiprocessing/queues.py (233) is a debug statement, just commenting it out seems to stop this exception.

Looking at the util file shows the logging functions to be all of the form:

    if _logger:
        _logger.log(...

Could it be possible that after the check, the _logger global (or the debug function) is destroyed by the exit handler? Can we convince them to stick around until such a time that they cannot be called?

Adding a small delay before joining also seems to work, but is ugly. Why should another Process *have* to have a minimum amount of work to not throw an exception?
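Brian's check-then-call hypothesis can be simulated deterministically without multiprocessing at all. The sketch below (all names are illustrative stand-ins, not multiprocessing.util's real internals) uses a generator to pause a "thread" exactly between the `if _logger:` check and the call, then nulls the global the way interpreter shutdown does. Capturing the global into a local before the check avoids the TypeError:

```python
_logger = print  # stand-in for a module-level logger hook

def debug_unsafe(msg):
    # Mirrors the `if _logger: _logger.log(...)` pattern.
    if _logger:
        yield                # shutdown clears the global right here
        _logger(msg)         # reads the global again -> may call None

def debug_safe(msg):
    logger = _logger         # capture into a local first
    if logger:
        yield                # shutdown clears the global right here
        logger(msg)          # the local still references the logger

def run(gen_fn):
    """Drive one 'thread', clearing the global at the worst moment."""
    global _logger
    _logger = print
    step = gen_fn("process shutting down")
    next(step)               # thread passes the truthiness check
    _logger = None           # simulate shutdown nulling module globals
    try:
        next(step)           # thread resumes and makes the call
    except StopIteration:
        return "ok"
    except TypeError as exc:
        return str(exc)

unsafe_result = run(debug_unsafe)
safe_result = run(debug_safe)
print("unsafe:", unsafe_result)
print("safe:", safe_result)
```

The unsafe variant reproduces exactly the `'NoneType' object is not callable` message from the tracebacks above; the local-capture variant does not, though it only narrows the window rather than fixing the underlying shutdown ordering.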
msg126525 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2011-01-19 14:22
On Tue, Jan 18, 2011 at 6:23 PM, Brian Thorne <report@bugs.python.org> wrote:
> With the example script attached I see the exception every time. On Ubuntu 10.10 with Python 2.6
>
> Since the offending line in multiprocesing/queues.py (233) is a debug statement, just commenting it out seems to stop this exception.
>
> Looking at the util file shows the logging functions to be all of the form:
>
>     if _logger:
>         _logger.log(...
>
> Could it be possible that after the check the _logger global (or the debug function) is destroyed by the exit handler? Can we convince them to stick around until such a time that they cannot be called?
>
> Adding a small delay before joining also seems to work, but is ugly. Why should another Process *have* to have a minimum amount of work to not throw an exception?

See http://bugs.python.org/issue9207 - but yes, the problem is that the VM is nuking our imported modules before all the processes are shut down.
msg135106 - (view) Author: Miquel Torres (tobami) Date: 2011-05-04 09:51
I can confirm this, but with Python 2.7.1 on Ubuntu 11.04 64-bit. My code was working with a queue that was being fed a two-string tuple. When I changed it to contain my custom objects, it still worked correctly, but the main program doesn't end until it raises the exception in QueueFeederThread.
msg142896 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-24 19:01
I can't seem to reproduce this under 3.3. Should it be closed?
msg142898 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2011-08-24 19:07
On Wed, Aug 24, 2011 at 3:01 PM, Antoine Pitrou <report@bugs.python.org> wrote: > > Antoine Pitrou <pitrou@free.fr> added the comment: > > I can't seem to reproduce this under 3.3. Should it be closed? I don't think so; it's still applicable to 2.x, and a fix should go into 2.7 ideally. http://bugs.python.org/issue9207 is the source of the issue AFAIR
msg142903 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-24 19:22
Indeed, 2.7 seems still affected.
msg142907 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-24 20:02
Ok, I think the reason this doesn't appear in 3.2/3.3 is the fix for . In 2.x (and 3.1) daemon threads can continue executing after the interpreter's internal structures have started being destroyed. The least intrusive solution is to always join the helper thread before shutting down the interpreter. Patch attached.
msg142911 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2011-08-24 20:16
In Antoine's patch, ISTM that the line created_by_this_process = ... could also be deleted, as the patch no longer uses that value and it's not used anywhere later in the method.
msg142919 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-08-24 20:43
New changeset d316315a8781 by Antoine Pitrou in branch '2.7': Issue #4106: Fix occasional exceptions printed out by multiprocessing on interpreter shutdown. http://hg.python.org/cpython/rev/d316315a8781
msg142920 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-24 20:52
This should hopefully be fixed now. Feel free to reopen if it isn't.
msg151237 - (view) Author: Yaniv Aknin (Yaniv.Aknin) Date: 2012-01-14 06:53
Ugh. Not 100% sure it's related, but I've been getting a similar traceback when running pip's test suite (python setup.py test) on OSX 10.6.8 with Python 2.7.2.

    Traceback (most recent call last):
      File "/usr/local/Cellar/python/2.7.2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
        func(*targs, **kargs)
      File "/usr/local/Cellar/python/2.7.2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/util.py", line 284, in _exit_function
        info('process shutting down')
    TypeError: 'NoneType' object is not callable

Obviously it's not the exact same bug as fixed here, but Googling the traceback led me here and I do think it's the same genre of bug, i.e., multiprocessing's use of forking leads to issues when atexit is called (wasn't sure whether to open it here or #9207). Also, see https://groups.google.com/forum/#!topic/nose-users/fnJ-kAUbYHQ, it seems other users of the nose test suite ran into this. I'm afraid I won't have time to look much further into this (the reason I'm running pip's test suite is that I'm already trying to make a contribution to pip...), but I thought it's best to at least mention it somewhere.
History
Date User Action Args
2022-04-11 14:56:40 admin set github: 48356
2012-01-14 06:53:29 Yaniv.Aknin set nosy: + Yaniv.Aknin; messages: +
2011-08-24 20:52:15 pitrou set status: open -> closed; resolution: fixed; messages: +; stage: patch review -> resolved
2011-08-24 20:43:57 python-dev set nosy: + python-dev; messages: +
2011-08-24 20:16:36 vinay.sajip set nosy: + vinay.sajip; messages: +
2011-08-24 20:02:47 pitrou set files: + mpqshutdown.patch; keywords: + patch; messages: +; stage: patch review
2011-08-24 19:22:54 pitrou set resolution: out of date -> (no value); messages: +; versions: - Python 3.2, Python 3.3
2011-08-24 19:07:36 jnoller set messages: +
2011-08-24 19:01:15 pitrou set versions: + Python 3.2, Python 3.3, - Python 2.6; nosy: + pitrou; messages: +; resolution: out of date
2011-05-04 09:51:25 tobami set nosy: + tobami; messages: +; versions: + Python 2.7
2011-04-14 00:04:28 dpranke set nosy: + dpranke
2011-01-19 14:22:07 jnoller set nosy: mattheww, jnoller, Thorney, bobbyi, gdb; messages: +
2011-01-18 23:23:07 Thorney set files: + test_mult.py; versions: + Python 2.6, - Python 2.7; nosy: + Thorney; messages: +
2010-10-10 21:50:47 mattheww set nosy: + mattheww
2010-07-08 20:08:30 gdb set messages: +
2010-07-08 19:50:52 jnoller set messages: +
2010-07-08 19:43:57 gdb set messages: +
2010-07-08 19:40:56 jnoller set messages: +
2010-07-08 19:28:01 gdb set nosy: + gdbmessages: +
2010-05-20 20:30:00 skip.montanaro set nosy: - skip.montanaro
2009-05-18 06:33:26 bobbyi set nosy: + bobbyi
2009-01-23 15:26:58 jnoller set messages: +
2009-01-23 14:47:01 jnoller set messages: +
2009-01-08 21:30:23 jnoller set assignee: jnoller; nosy: + jnoller
2008-10-11 12:36:12 skip.montanaro set messages: +
2008-10-11 12:28:25 skip.montanaro set messages: +
2008-10-11 03:31:32 skip.montanaro set messages: +
2008-10-11 03:30:21 skip.montanaro create