msg71537 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-08-20 11:00 |
BufferedWriter from Lib/io.py is thread-safe, but... the Python instruction "with self._write_lock:" could be interrupted when the lock is already acquired. Especially, _lsprof.Profiler() uses ceval hook and is called when the lock is acquired. But if the profiler (or any other module using the hook) re-use the stream (stdout or stderr), we get a deadlock!? Well, this problem is really not critical since only developers profilers (really?). Solution: use C implementation of Lib/io.py (from Python 2.6) to avoid ceval hook? Example with py3k trunk: >>> import _lsprof >>> prof=_lsprof.Profiler(42) >>> prof.enable() # _lsprof calls 42() # -> 42 is not callable # -> call PyErr_WriteUnraisable(<42 is not callable>) # -> deadlock Traceback example: #6 0x080ecb30 in PyThread_acquire_lock (lock=0x820f550, waitflag=1) at Python/thread_pthread.h:352 (...) #8 0x08162521 in PyCFunction_Call (func=0x83abfbc, arg=0xb7dc6034, kw=0x0) at Objects/methodobject.c:81 #9 0x080b3659 in call_function (pp_stack=0xbfbf9474, oparg=0) at Python/ceval.c:3403 #10 0x080ae7a6 in PyEval_EvalFrameEx (f=0x83b9f94, throwflag=0) at Python/ceval.c:2205 #11 0x080b3aae in fast_function (func=0x83a25b4, pp_stack=0xbfbf9724, n=2, na=2, nk=0) at Python/ceval.c:3491 #12 0x080b37ef in call_function (pp_stack=0xbfbf9724, oparg=1) at Python/ceval.c:3424 #13 0x080ae7a6 in PyEval_EvalFrameEx (f=0x83b981c, throwflag=0) at Python/ceval.c:2205 #14 0x080b1aee in PyEval_EvalCodeEx (co=0x838a328, globals=0x8330534, locals=0x0, args=0x8373da0, argcount=2, kws=0x0, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:2840 #15 0x0814ab2e in function_call (func=0x83a3b8c, arg=0x8373d8c, kw=0x0) at Objects/funcobject.c:628 #16 0x08118d19 in PyObject_Call (func=0x83a3b8c, arg=0x8373d8c, kw=0x0) at Objects/abstract.c:2181 #17 0x08132eb0 in method_call (func=0x83a3b8c, arg=0x8373d8c, kw=0x0) at Objects/classobject.c:323 #18 0x08118d19 in PyObject_Call (func=0x83037dc, arg=0x83141f4, kw=0x0) at Objects/abstract.c:2181 #19 0x080b2ed8 in PyEval_CallObjectWithKeywords (func=0x83037dc, arg=0x83141f4, kw=0x0) at Python/ceval.c:3283 #20 0x08141779 in PyFile_WriteObject (v=0x8398e28, f=0x83ab0dc, flags=1) at Objects/fileobject.c:164 #21 0x08141974 in PyFile_WriteString (s=0x819a2f2 "Exception ", f=0x83ab0dc) at Objects/fileobject.c:189 #22 0x080c473c in PyErr_WriteUnraisable (obj=0x81fbd78) at Python/errors.c:696 #23 0xb7f9612f in CallExternalTimer (pObj=0x83a7aa8) at /home/haypo/prog/py3k/Modules/_lsprof.c:135 #24 0xb7f96984 in Stop (pObj=0x83a7aa8, self=0x826c6d8, entry=0x826ec80) at /home/haypo/prog/py3k/Modules/_lsprof.c:337 #25 0xb7f96c60 in ptrace_leave_call (self=0x83a7aa8, key=0x81cb150) at /home/haypo/prog/py3k/Modules/_lsprof.c:420 #26 0xb7f96d5c in profiler_callback (self=0x83a7aa8, frame=0x835a0b4, what=6, arg=0x83ab92c) at /home/haypo/prog/py3k/Modules/_lsprof.c:471 #27 0x080b28cb in call_trace (func=0xb7f96c85 <profiler_callback>, obj=0x83a7aa8, frame=0x835a0b4, what=6, arg=0x83ab92c) at Python/ceval.c:3090 #28 0x080b35da in call_function (pp_stack=0xbfbf9d74, oparg=1) at Python/ceval.c:3403 #29 0x080ae7a6 in PyEval_EvalFrameEx (f=0x835a0b4, throwflag=0) at Python/ceval.c:2205 ceval hook: Python/ceval.3403: PCALL(PCALL_CFUNCTION); if (flags & (METH_NOARGS | METH_O)) { ... } else { PyObject *callargs; callargs = load_args(pp_stack, na); READ_TIMESTAMP(*pintr0); C_TRACE(x, PyCFunction_Call(func,callargs,NULL)); <= **here** READ_TIMESTAMP(*pintr1); Py_XDECREF(callargs); } |
|
|
msg71538 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-08-20 11:56 |
Yes indeed. We could use an RLock to avoid the problem but RLock's are damn slow since they are written in pure Python (see #3001). Rewriting the critical parts of RLock (constructor, acquire(), release(), __enter__(), __exit__()) in C should not be too complicated, would you want to do it? :) > Solution: use C implementation of Lib/io.py (from Python 2.6) to avoid > ceval hook? I guess it's out of question. However, Buffered{Reader,Writer,Random} should be rewritten in C one day, it is necessary for speed. Then the problem will vanish since a lock will only need to be taken when releasing the GIL, that is not when Python code is being interpreted. |
|
|
msg71541 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-08-20 13:51 |
@pitrou: I wrote an implementation of RLock in C (see #3001). So it would be possible to use threading.RLock instead of threading.Lock ;-) |
|
|
msg71543 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-08-20 14:04 |
Oops, I forgot to update PyInit__Thread() with my new time: - Add PyType_Ready() - Register RLockType to threading dict Here is the new patch |
|
|
msg71545 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-08-20 14:06 |
Ooops again, I uploaded my patch to the wrong issue! The new patch is now in the issue #3001. |
|
|
msg71607 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2008-08-21 05:15 |
see the python-3000 thread I just started asking for opinions on this. I'd personally say this bug is a good reason to go ahead with #3001 for 3.0. |
|
|
msg71625 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-08-21 11:36 |
So if we consider that RLock is fast enough (see my C version of RLokc in #3001), we can use RLock instead of Lock to avoid this issue. Here is a patch to use RLock and also including an unit test of this issue. |
|
|
msg71626 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-08-21 12:20 |
Selon STINNER Victor <report@bugs.python.org>: > > So if we consider that RLock is fast enough (see my C version of RLokc > in #3001), we can use RLock instead of Lock to avoid this issue. Here > is a patch to use RLock and also including an unit test of this issue. I tried your test and it seems to lock when using the Python implementation of RLock, but perhaps it's precisely because the implementation is in Python. Also, your RLock implementation has to be finished before going further ;) (see my comments in #3001) |
|
|
msg72496 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-09-04 12:17 |
We might as well bite the bullet and include a short, minimalist RLock implementation in io.py (so as not to pull threading and all its dependencies at startup). The C version of RLock will wait for 3.1. |
|
|
msg72507 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-09-04 14:14 |
Here is a patch. The RLock implementation is naive and as simple as possible. It doesn't solve Haypo's case, probably because the tracing func kicks in in the RLock code itself. I don't want to make a decision on this alone, so someone please advise. |
|
|
msg72510 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-09-04 15:30 |
(I have to add that the patch makes small reads about 60-80% slower) |
|
|
msg78099 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-12-20 13:22 |
I hope that this issue will be fixed by io-c (io library rewritten in C language). |
|
|
msg80789 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2009-01-29 22:43 |
The test (io_deadlock.patch) pass on the io-c branch \o/ |
|
|
msg82924 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2009-02-28 17:15 |
Yes, this is solved in the io-c branch. Antoine, do you think we should switch _pyio to use the RLock? |
|
|
msg82933 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2009-02-28 18:12 |
I don't know. The RLock is a lot slower than the normal non-recursive variation, on the other hand I'm not sure we care about performance of the Python version that much. Opinions welcome. |
|
|
msg82934 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2009-02-28 18:19 |
2009/2/28 Antoine Pitrou <report@bugs.python.org>: > > Antoine Pitrou <pitrou@free.fr> added the comment: > > I don't know. The RLock is a lot slower than the normal non-recursive > variation, on the other hand I'm not sure we care about performance of > the Python version that much. Opinions welcome. I'm +0. The deadlock will only affect people specifically messing with the Python io implementation. It's low priority, so maybe we should just do it when RLock is rewritten in C. |
|
|
msg83136 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2009-03-04 21:30 |
Since io-c has been merging, I'm lowering priority. |
|
|
msg83169 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2009-03-05 00:28 |
> Since io-c has been merging, I'm lowering priority. Why not fixing this issue? The issue is rare (only occurs when using profiling with a callback writting to stdout) and you closed the issue #4862 which is more common (read an UTF-16 file). |
|
|
msg83171 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2009-03-05 00:30 |
2009/3/4 STINNER Victor <report@bugs.python.org>: > > STINNER Victor <victor.stinner@haypocalc.com> added the comment: > >> Since io-c has been merging, I'm lowering priority. > > Why not fixing this issue? The issue is rare (only occurs when using profiling > with a callback writting to stdout) and you closed the issue #4862 which is > more common (read an UTF-16 file). See . |
|
|
msg83175 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2009-03-05 01:00 |
Ooops, I wanted to write: "Why not *closing* this issue?", sorry. |
|
|
msg83177 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2009-03-05 01:02 |
I suppose we might as well. If anyone wants to fix the Python implementation later, they can go ahead and reopen this. |
|
|