Issue 3001: RLock's are SLOW (original) (raw)

Created on 2008-05-29 15:28 by jcea, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
rlock-v2.patch vstinner,2008-08-20 14:06 Fixed C implementation of RLock
rlock2.patch pitrou,2009-11-07 01:07
rlock3.patch pitrou,2009-11-07 17:47
rlock4.patch pitrou,2009-11-07 20:12
Messages (20)
msg67497 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2008-05-29 15:28
threading.RLock acquire/release is very slow. A order of magnitude higher than no reentrant threading.Lock: """ def RLockSpeed() : import time, threading t=time.time() result={} for i in xrange(1000000) : pass result["empty loop"]=time.time()-t l=threading.Lock() t=time.time() for i in xrange(1000000) : l.acquire() l.release() result["Lock"]=time.time()-t l=threading.RLock() t=time.time() for i in xrange(1000000) : l.acquire() l.release() result["RLock"]=time.time()-t return result if __name__=="__main__" : print RLockSpeed() """ Result: {'empty loop': 0.074212074279785156, 'RLock': 10.144084215164185, 'Lock': 1.2489800453186035}
msg67703 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-06-04 21:33
You should investigate and try to diagnose where the speed difference comes from. ISTM the RLock class is implemented in Python while the Lock class is simply an alias to the builtin native lock type, which could explain most of the discrepancy.
msg68512 - (view) Author: sebastian serrano (sserrano) Date: 2008-06-21 16:40
Running with python -O the timing gets a little closer between Lock and RLock. This code won't be easy to improve in performance. The heaviest call is current_thread(), used at lines: 117: me = current_thread() 137: if self.__owner is not current_thread(): and only consist on: 788: def current_thread(): 789: try: 790: return _active[_get_ident()] 791: except KeyError: 792: ##print "current_thread(): no current thread for", _get_ident() 793: return _DummyThread() Simple profiler dump: $../python-trunk/python -O rlock.py time Lock 0.720541000366 time RLock 4.90231084824 400004 function calls in 0.982 CPU seconds Ordered by: internal time, call count ncalls tottime percall cumtime percall filename:lineno(function) 100000 0.304 0.000 0.390 0.000 threading.py:116(acquire) 100000 0.278 0.000 0.360 0.000 threading.py:136(release) 1 0.232 0.232 0.982 0.982 rlock.py:27(testRLock) 200000 0.168 0.000 0.168 0.000 threading.py:788(current_thread) 1 0.000 0.000 0.000 0.000 threading.py:103(__init__) 1 0.000 0.000 0.000 0.000 threading.py:98(RLock) 1 0.000 0.000 0.000 0.000 threading.py:76(__init__) 0 0.000 0.000 profile:0(profiler)
msg68536 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-06-21 19:22
Le samedi 21 juin 2008 à 16:40 +0000, sebastian serrano a écrit : > sebastian serrano <sebastian@devsar.com> added the comment: > > Running with python -O the timing gets a little closer between Lock and > RLock. This code won't be easy to improve in performance. > The heaviest call is current_thread(), used at lines: > 117: me = current_thread() > 137: if self.__owner is not current_thread(): One could always try to rewrite RLock by replacing calls to threading.current_thread() with thread.get_ident(). However, given the profile table you have appended, it will only save at most 30% of the time. If someone needs a more important speed-up, he should reimplement the RLock type in C (and contribute it back :-)).
msg71540 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-08-20 13:49
As suggested by pitrou, I wrote an implementation of RLock in C. Changes to Python version: - no debug message: i leave the message in #if 0 ... #endif - acquire() method argument "blocking" is not a keyword Notes: - RLock has no docstring! - In the Python version, RLock._release_save() replaces owner and counter attributes before release the lock. But releasing the lock may fails and no the object is in an inconsistent state
msg71544 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-08-20 14:06
Oops, I forgot to update PyInit__Thread() with my new time: - Add PyType_Ready() - Register RLockType to threading dict Here is the new patch.
msg71546 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-20 14:17
Wow, that was quick. Did you try to replace threading.RLock with your implementation, and run the tests? By the way: > - acquire() method argument "blocking" is not a keyword > - In the Python version, RLock._release_save() replaces owner and > counter attributes before release the lock. But releasing the lock may > fails and no the object is in an inconsistent state Removing the debugging statements is fine, but apart from that the C implementation should mimick the current behaviour. Even if this behaviour has potential pitfalls. Speaking of which, it would be nice if RLock was subclassable. I don't know whether any software relies on this though.
msg71564 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-08-20 18:28
I doubt subclassability of RLock matters but who knows, people do code things. Regardless, using a C version wrapped in a simple python container class that calls the underlying C implementation's methods should be sufficient to allow sub-classing. Given the final 2.6 beta is scheduled for today, this won't make it into 2.6/3.0 so we've got some time to polish up what we want.
msg71568 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-20 19:00
Gregory, would you have an advice on #3618?
msg74555 - (view) Author: Hugh Gibson (hgibson50) Date: 2008-10-09 05:39
> I doubt subclassability of RLock matters but who knows, people do code > things. I've recently done this to implement potential deadlock detection. I keep a record of the sequences of acquired locks, find unique sequences, then check for conflicts between each sequence. There's not much overhead and it highlighted some potential deadlocks where lock A and B were acquired AB in one route through code and BA in another route. The algorithm is a simplified version of that used in Linux - see http://www.mjmwired.net/kernel/Documentation/lockdep-design.txt Hugh
msg95007 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-11-07 01:07
Here is a new patch against py3k. It passes all tests and is generally 10-15x faster than the pure Python version.
msg95012 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-11-07 07:48
Reviewers: , http://codereview.appspot.com/150055/diff/1/4 File Modules/_threadmodule.c (right): http://codereview.appspot.com/150055/diff/1/4#newcode221 Modules/_threadmodule.c:221: return PyBool_FromLong((long) r); This explicit (long) cast is unnecessary. http://codereview.appspot.com/150055/diff/1/4#newcode246 Modules/_threadmodule.c:246: PyThread_release_lock(self->rlock_lock); reset self->rlock_owner to 0 before releasing the lock. Description: code review for http://bugs.python.org/issue3001 Please review this at http://codereview.appspot.com/150055 Affected files: M Lib/test/test_threading.py M Lib/threading.py M Modules/_threadmodule.c
msg95014 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-11-07 08:57
rlock_acquire_doc: "(...) return None once the lock is acquired". That's wrong, acquire() always return a boolean (True or False). rlock_release(): Is the assert(self->rlock_count > 0); really required? You're checking its value few lines before. rlock_acquire_restore(): raise an error if the lock acquire failed, whereas rlock_acquire() doesn't. Why not raising an error in rlock_acquire() (especially if blocking is True)? Or if the error can never occur, remove the error checking in rlock_acquire_restore(). rlock_acquire_restore(): (maybe) set owner to 0 if count is 0. rlock_release_save(): may also reset self->rlock_owner to 0, as rlock_release(). rlock_new(): why not reusing type instead of Py_TYPE(self) in "Py_TYPE(self)->tp_free(self)"? __exit__: rlock_release() is defined as __exit__() with METH_VARARGS, but it has no argument. Can it be a problem? Anything, thanks for the faster RLock! If your patch is commited, you can reconsider #3618 (possible deadlock in python IO implementation).
msg95018 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-11-07 15:06
Thanks for the review. I will make the suggested modifications. http://codereview.appspot.com/150055/diff/1/4 File Modules/_threadmodule.c (right): http://codereview.appspot.com/150055/diff/1/4#newcode221 Modules/_threadmodule.c:221: return PyBool_FromLong((long) r); On 2009/11/07 07:48:05, gregory.p.smith wrote: > This explicit (long) cast is unnecessary. Right. http://codereview.appspot.com/150055/diff/1/4#newcode246 Modules/_threadmodule.c:246: PyThread_release_lock(self->rlock_lock); On 2009/11/07 07:48:05, gregory.p.smith wrote: > reset self->rlock_owner to 0 before releasing the lock. We always check rlock_count before rlock_owner anyway but, yes, I could reset rlock_owner out of safety. http://codereview.appspot.com/150055
msg95019 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-11-07 15:17
> rlock_acquire_doc: "(...) return None once the lock is acquired". > That's wrong, acquire() always return a boolean (True or False). You're right, I should fix that docstring. > rlock_release(): Is the assert(self->rlock_count > 0); really required? > You're checking its value few lines before. Right again :) I forgot this was unsigned. > rlock_acquire_restore(): raise an error if the lock acquire failed, > whereas rlock_acquire() doesn't. Why not raising an error in > rlock_acquire() (especially if blocking is True)? For rlock_acquire(), I mimicked what the Python code (_PyRLock.acquire) does: if locking fails, it returns False instead. It is part of the API. (and I agree this is not necessarily right, because failing to lock if blocking is True would probably indicate a low-level system error, but the purpose of the patch is not to change the API) But you're right that the Python version of rlock_acquire_restore() doesn't check the return code either, so I may remove this check from the C code, although the rest of the method clearly assumes the lock has been taken. > rlock_acquire_restore(): (maybe) set owner to 0 if count is 0. > > rlock_release_save(): may also reset self->rlock_owner to 0, as > rlock_release(). As I said to Gregory, the current code doesn't rely on rlock_owner to be reset but, yes, we could still add it out of safety. > rlock_new(): why not reusing type instead of Py_TYPE(self) in > "Py_TYPE(self)->tp_free(self)"? Good point. > __exit__: rlock_release() is defined as __exit__() with METH_VARARGS, > but it has no argument. Can it be a problem? I just mimicked the corresponding declarations in the non-recursive lock object. Apparently it's safe on all architectures Python has been running on for years, although I agree it looks strange. > If your patch is commited, you can reconsider #3618 (possible deadlock > in python IO implementation). Indeed. Thanks !
msg95022 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-11-07 17:47
Here is an updated patch. I addressed all review comments, except the one about acquire_restore() checking the return result of acquire(), because I think it's really a weakness in the Python implementation.
msg95023 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-11-07 19:44
Can you make the C implementation's repr() show something similar to the Python implementation?
msg95024 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-11-07 20:12
Yes, here is a new patch adding tp_repr.
msg95042 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-11-08 14:10
rlock4.patch looks correct and pass test_threading.py tests.
msg95127 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-11-10 18:52
I've committed the latest patch in r76189. Thanks for the reviews, everyone.
History
Date User Action Args
2022-04-11 14:56:35 admin set github: 47251
2009-11-10 18:52:30 pitrou set status: open -> closedresolution: fixedmessages: +
2009-11-08 14:10:34 vstinner set messages: +
2009-11-07 20:12:18 pitrou set files: + rlock4.patchmessages: +
2009-11-07 19:44:51 gregory.p.smith set messages: +
2009-11-07 17:47:29 pitrou set files: + rlock3.patchmessages: +
2009-11-07 15:17:14 pitrou set messages: +
2009-11-07 15:06:57 pitrou set messages: +
2009-11-07 08:57:22 vstinner set messages: +
2009-11-07 07:48:07 gregory.p.smith set messages: +
2009-11-07 01:13:01 pitrou set stage: needs patch -> patch review
2009-11-07 01:07:52 pitrou set files: + rlock2.patchmessages: +
2009-11-06 00:57:42 pitrou set assignee: pitrouversions: + Python 3.2, - Python 3.1, Python 2.7nosy:gregory.p.smith, jcea, Rhamphoryncus, pitrou, hgibson50, vstinner, giampaolo.rodola, kevinwatters, sserranocomponents: - Interpreter Corestage: patch review -> needs patch
2009-05-16 19:35:27 ajaksu2 set stage: patch review
2008-12-02 18:02:37 kevinwatters set nosy: + kevinwatters
2008-10-09 05:39:10 hgibson50 set nosy: + hgibson50messages: +
2008-09-22 17:08:53 vstinner set files: - rlock.patch
2008-08-20 19:00:35 pitrou set messages: + versions: - Python 2.6, Python 3.0
2008-08-20 18:28:38 gregory.p.smith set messages: + versions: + Python 3.1, Python 2.7
2008-08-20 14:17:05 pitrou set messages: +
2008-08-20 14:06:15 vstinner set files: + rlock-v2.patchmessages: +
2008-08-20 13:49:40 vstinner set files: + rlock.patchnosy: + vstinnermessages: + keywords: + patch
2008-07-07 05:01:53 gregory.p.smith set priority: normalnosy: + gregory.p.smithcomponents: + Interpreter Core, Library (Lib)
2008-06-21 19:22:15 pitrou set messages: +
2008-06-21 16:40:52 sserrano set nosy: + sserranomessages: +
2008-06-04 21:33:34 pitrou set nosy: + pitroumessages: +
2008-06-03 22:54:11 giampaolo.rodola set nosy: + giampaolo.rodola
2008-05-29 20:06:20 Rhamphoryncus set nosy: + Rhamphoryncus
2008-05-29 15:28:10 jcea create