Issue 3618: possible deadlock in python IO implementation (original) (raw)

Issue3618

Created on 2008-08-20 11:00 by vstinner, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
io_deadlock.patch vstinner,2008-08-21 11:36 Use RLock instead of Lock + unit test for the bug
rlockio.patch pitrou,2008-09-04 14:14
Messages (21)
msg71537 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-08-20 11:00
BufferedWriter from Lib/io.py is thread-safe, but... the Python instruction "with self._write_lock:" could be interrupted when the lock is already acquired. Especially, _lsprof.Profiler() uses ceval hook and is called when the lock is acquired. But if the profiler (or any other module using the hook) re-use the stream (stdout or stderr), we get a deadlock!? Well, this problem is really not critical since only developers profilers (really?). Solution: use C implementation of Lib/io.py (from Python 2.6) to avoid ceval hook? Example with py3k trunk: >>> import _lsprof >>> prof=_lsprof.Profiler(42) >>> prof.enable() # _lsprof calls 42() # -> 42 is not callable # -> call PyErr_WriteUnraisable(<42 is not callable>) # -> deadlock Traceback example: #6 0x080ecb30 in PyThread_acquire_lock (lock=0x820f550, waitflag=1) at Python/thread_pthread.h:352 (...) #8 0x08162521 in PyCFunction_Call (func=0x83abfbc, arg=0xb7dc6034, kw=0x0) at Objects/methodobject.c:81 #9 0x080b3659 in call_function (pp_stack=0xbfbf9474, oparg=0) at Python/ceval.c:3403 #10 0x080ae7a6 in PyEval_EvalFrameEx (f=0x83b9f94, throwflag=0) at Python/ceval.c:2205 #11 0x080b3aae in fast_function (func=0x83a25b4, pp_stack=0xbfbf9724, n=2, na=2, nk=0) at Python/ceval.c:3491 #12 0x080b37ef in call_function (pp_stack=0xbfbf9724, oparg=1) at Python/ceval.c:3424 #13 0x080ae7a6 in PyEval_EvalFrameEx (f=0x83b981c, throwflag=0) at Python/ceval.c:2205 #14 0x080b1aee in PyEval_EvalCodeEx (co=0x838a328, globals=0x8330534, locals=0x0, args=0x8373da0, argcount=2, kws=0x0, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:2840 #15 0x0814ab2e in function_call (func=0x83a3b8c, arg=0x8373d8c, kw=0x0) at Objects/funcobject.c:628 #16 0x08118d19 in PyObject_Call (func=0x83a3b8c, arg=0x8373d8c, kw=0x0) at Objects/abstract.c:2181 #17 0x08132eb0 in method_call (func=0x83a3b8c, arg=0x8373d8c, kw=0x0) at Objects/classobject.c:323 #18 0x08118d19 in PyObject_Call (func=0x83037dc, arg=0x83141f4, kw=0x0) at Objects/abstract.c:2181 #19 0x080b2ed8 in PyEval_CallObjectWithKeywords (func=0x83037dc, arg=0x83141f4, kw=0x0) at Python/ceval.c:3283 #20 0x08141779 in PyFile_WriteObject (v=0x8398e28, f=0x83ab0dc, flags=1) at Objects/fileobject.c:164 #21 0x08141974 in PyFile_WriteString (s=0x819a2f2 "Exception ", f=0x83ab0dc) at Objects/fileobject.c:189 #22 0x080c473c in PyErr_WriteUnraisable (obj=0x81fbd78) at Python/errors.c:696 #23 0xb7f9612f in CallExternalTimer (pObj=0x83a7aa8) at /home/haypo/prog/py3k/Modules/_lsprof.c:135 #24 0xb7f96984 in Stop (pObj=0x83a7aa8, self=0x826c6d8, entry=0x826ec80) at /home/haypo/prog/py3k/Modules/_lsprof.c:337 #25 0xb7f96c60 in ptrace_leave_call (self=0x83a7aa8, key=0x81cb150) at /home/haypo/prog/py3k/Modules/_lsprof.c:420 #26 0xb7f96d5c in profiler_callback (self=0x83a7aa8, frame=0x835a0b4, what=6, arg=0x83ab92c) at /home/haypo/prog/py3k/Modules/_lsprof.c:471 #27 0x080b28cb in call_trace (func=0xb7f96c85 <profiler_callback>, obj=0x83a7aa8, frame=0x835a0b4, what=6, arg=0x83ab92c) at Python/ceval.c:3090 #28 0x080b35da in call_function (pp_stack=0xbfbf9d74, oparg=1) at Python/ceval.c:3403 #29 0x080ae7a6 in PyEval_EvalFrameEx (f=0x835a0b4, throwflag=0) at Python/ceval.c:2205 ceval hook: Python/ceval.3403: PCALL(PCALL_CFUNCTION); if (flags & (METH_NOARGS | METH_O)) { ... } else { PyObject *callargs; callargs = load_args(pp_stack, na); READ_TIMESTAMP(*pintr0); C_TRACE(x, PyCFunction_Call(func,callargs,NULL)); <= **here** READ_TIMESTAMP(*pintr1); Py_XDECREF(callargs); }
msg71538 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-20 11:56
Yes indeed. We could use an RLock to avoid the problem but RLock's are damn slow since they are written in pure Python (see #3001). Rewriting the critical parts of RLock (constructor, acquire(), release(), __enter__(), __exit__()) in C should not be too complicated, would you want to do it? :) > Solution: use C implementation of Lib/io.py (from Python 2.6) to avoid > ceval hook? I guess it's out of question. However, Buffered{Reader,Writer,Random} should be rewritten in C one day, it is necessary for speed. Then the problem will vanish since a lock will only need to be taken when releasing the GIL, that is not when Python code is being interpreted.
msg71541 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-08-20 13:51
@pitrou: I wrote an implementation of RLock in C (see #3001). So it would be possible to use threading.RLock instead of threading.Lock ;-)
msg71543 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-08-20 14:04
Oops, I forgot to update PyInit__Thread() with my new time: - Add PyType_Ready() - Register RLockType to threading dict Here is the new patch
msg71545 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-08-20 14:06
Ooops again, I uploaded my patch to the wrong issue! The new patch is now in the issue #3001.
msg71607 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-08-21 05:15
see the python-3000 thread I just started asking for opinions on this. I'd personally say this bug is a good reason to go ahead with #3001 for 3.0.
msg71625 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-08-21 11:36
So if we consider that RLock is fast enough (see my C version of RLokc in #3001), we can use RLock instead of Lock to avoid this issue. Here is a patch to use RLock and also including an unit test of this issue.
msg71626 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-21 12:20
Selon STINNER Victor <report@bugs.python.org>: > > So if we consider that RLock is fast enough (see my C version of RLokc > in #3001), we can use RLock instead of Lock to avoid this issue. Here > is a patch to use RLock and also including an unit test of this issue. I tried your test and it seems to lock when using the Python implementation of RLock, but perhaps it's precisely because the implementation is in Python. Also, your RLock implementation has to be finished before going further ;) (see my comments in #3001)
msg72496 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-09-04 12:17
We might as well bite the bullet and include a short, minimalist RLock implementation in io.py (so as not to pull threading and all its dependencies at startup). The C version of RLock will wait for 3.1.
msg72507 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-09-04 14:14
Here is a patch. The RLock implementation is naive and as simple as possible. It doesn't solve Haypo's case, probably because the tracing func kicks in in the RLock code itself. I don't want to make a decision on this alone, so someone please advise.
msg72510 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-09-04 15:30
(I have to add that the patch makes small reads about 60-80% slower)
msg78099 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-12-20 13:22
I hope that this issue will be fixed by io-c (io library rewritten in C language).
msg80789 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-01-29 22:43
The test (io_deadlock.patch) pass on the io-c branch \o/
msg82924 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2009-02-28 17:15
Yes, this is solved in the io-c branch. Antoine, do you think we should switch _pyio to use the RLock?
msg82933 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-02-28 18:12
I don't know. The RLock is a lot slower than the normal non-recursive variation, on the other hand I'm not sure we care about performance of the Python version that much. Opinions welcome.
msg82934 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2009-02-28 18:19
2009/2/28 Antoine Pitrou <report@bugs.python.org>: > > Antoine Pitrou <pitrou@free.fr> added the comment: > > I don't know. The RLock is a lot slower than the normal non-recursive > variation, on the other hand I'm not sure we care about performance of > the Python version that much. Opinions welcome. I'm +0. The deadlock will only affect people specifically messing with the Python io implementation. It's low priority, so maybe we should just do it when RLock is rewritten in C.
msg83136 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2009-03-04 21:30
Since io-c has been merging, I'm lowering priority.
msg83169 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-03-05 00:28
> Since io-c has been merging, I'm lowering priority. Why not fixing this issue? The issue is rare (only occurs when using profiling with a callback writting to stdout) and you closed the issue #4862 which is more common (read an UTF-16 file).
msg83171 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2009-03-05 00:30
2009/3/4 STINNER Victor <report@bugs.python.org>: > > STINNER Victor <victor.stinner@haypocalc.com> added the comment: > >> Since io-c has been merging, I'm lowering priority. > > Why not fixing this issue? The issue is rare (only occurs when using profiling > with a callback writting to stdout) and you closed the issue #4862 which is > more common (read an UTF-16 file). See .
msg83175 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-03-05 01:00
Ooops, I wanted to write: "Why not *closing* this issue?", sorry.
msg83177 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2009-03-05 01:02
I suppose we might as well. If anyone wants to fix the Python implementation later, they can go ahead and reopen this.
History
Date User Action Args
2022-04-11 14:56:37 admin set github: 47868
2009-03-05 01:02:19 benjamin.peterson set status: open -> closedresolution: wont fixmessages: +
2009-03-05 01:00:03 vstinner set messages: +
2009-03-05 00:30:47 benjamin.peterson set messages: +
2009-03-05 00:28:24 vstinner set messages: +
2009-03-04 21:30:46 benjamin.peterson unlink issue4565 dependencies
2009-03-04 21:30:36 benjamin.peterson set priority: critical -> lowmessages: + title: possible deadlock in IO library (Lib/io.py) -> possible deadlock in python IO implementation
2009-02-28 18:19:02 benjamin.peterson set messages: +
2009-02-28 18:12:48 pitrou set messages: +
2009-02-28 17:15:05 benjamin.peterson set nosy: + benjamin.petersonmessages: +
2009-02-28 17:14:36 benjamin.peterson link issue4565 dependencies
2009-01-29 22:43:36 vstinner set messages: +
2008-12-27 06:02:06 ggenellina set nosy: + ggenellina
2008-12-26 23:17:34 pitrou link issue4716 superseder
2008-12-20 13:22:23 vstinner set messages: +
2008-09-04 15:30:03 pitrou set messages: +
2008-09-04 14:16:46 pitrou set keywords: + needs review
2008-09-04 14:14:27 pitrou set files: + rlockio.patchmessages: +
2008-09-04 12:17:58 pitrou set messages: +
2008-09-04 01:16:29 benjamin.peterson set priority: release blocker -> critical
2008-09-03 18:20:12 jcea set nosy: + jcea
2008-08-21 12:20:04 pitrou set messages: +
2008-08-21 11:36:27 vstinner set files: + io_deadlock.patchmessages: +
2008-08-21 05:15:25 gregory.p.smith set messages: +
2008-08-21 04:47:32 gregory.p.smith set nosy: + gregory.p.smith
2008-08-21 03:07:41 barry set priority: deferred blocker -> release blocker
2008-08-20 22:28:17 barry set priority: high -> deferred blocker
2008-08-20 14:06:50 vstinner set messages: +
2008-08-20 14:06:20 vstinner set files: - rlock.patch
2008-08-20 14:04:33 vstinner set files: + rlock.patchkeywords: + patchmessages: +
2008-08-20 13:51:47 vstinner set messages: +
2008-08-20 11:56:37 pitrou set priority: highnosy: + pitroumessages: +
2008-08-20 11:00:50 vstinner create