Asyncio RLock - reentrant locks for async Python
November 29, 2022, 9:12am 1
I’d like to re-open the discussion started here https://github.com/python/asyncio/issues/439, which requested reentrant locks to be added to asyncio.
GvR argues that locks are used less in event-loop-based concurrency because context switching happens in a more controlled way, and I agree. But there are still use cases where a complex state change in a component involves I/O, requiring async locking (and no one argued against that).
What I don’t understand is how those observations make a point against RLocks in async. It’s true that RLocks are not a necessity. But that holds for async as well as threaded concurrency. They’re just a convenience. You could always refactor your code to not need them (example given in the same issue). In the most extreme case, that refactoring could mean implementing your own RLock, which isn’t hard and only requires (1) a simple mutex/lock and (2) a way to identify the current thread/task.
So they’re not necessary, but they do help to reduce the complexity of code and make it more robust, concise, and easier to reason about.
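As a rough illustration of the two ingredients named above (a plain lock plus a way to identify the current task), here is a minimal sketch. The class name `TaskRLock` and its attributes are hypothetical, not an existing asyncio API, and it glosses over edge cases such as fairness that a stdlib implementation would need to handle.

```python
import asyncio


class TaskRLock:
    """Hypothetical reentrant lock keyed on the current asyncio task."""

    def __init__(self):
        self._lock = asyncio.Lock()  # the underlying simple mutex
        self._owner = None           # task currently holding the lock
        self._depth = 0              # how many times the owner acquired it

    async def acquire(self):
        me = asyncio.current_task()
        if self._owner is me:
            # Re-entry by the owning task: only bump the counter.
            self._depth += 1
            return True
        await self._lock.acquire()
        self._owner = me
        self._depth = 1
        return True

    def release(self):
        if self._owner is not asyncio.current_task():
            raise RuntimeError("cannot release a lock owned by another task")
        self._depth -= 1
        if self._depth == 0:
            # Outermost release: hand the mutex back.
            self._owner = None
            self._lock.release()

    async def __aenter__(self):
        await self.acquire()

    async def __aexit__(self, exc_type, exc, tb):
        self.release()


async def demo():
    lock = TaskRLock()
    async with lock:
        async with lock:  # a plain asyncio.Lock would deadlock here
            return lock._depth
```

Calling `asyncio.run(demo())` enters the lock twice from the same task and returns the nesting depth observed inside the inner block.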
I ended up implementing my own async RLock and it was a somewhat alien experience having to do that because I’m very used to Python being that language that comes with batteries included.
Worth noting that I’m not the only one that felt the need for async RLocks. There’s even a package to ship this bit of code: asyncio-rlock
(imagine the link to pypi here, I was only permitted to post up to 2 links)
asvetlov (Andrew Svetlov) November 29, 2022, 9:46am 2
Any functionality in stdlib has its own maintenance cost.
If a third-party library on PyPI solves your needs – that’s fine, please use it.
When we start feeling the high demand for the library, this situation can be a reason for embedding it into stdlib.
For asyncio-rlock I see very few usages on the github. Maybe the feature request is not very popular?
robsdedude (Robsdedude) November 29, 2022, 12:42pm 3
For asyncio-rlock I see very few usages on the github. Maybe the feature request is not very popular?
Might be. Or maybe people just wrote their own implementation. Or they swallowed the pill of having to refactor their code to work with simple locks.
Looking at Code search results · GitHub you can see 4 libs or so that rolled their own solution. Possibly more that I didn’t find because they could’ve named their class whatever.
Given that it’s neither a lot of code nor complex code, and there is some (hard to measure) demand, I think it would be a cheap and worthy addition to asyncio.
To elaborate on my use-case: I had a decently sized, sync code-base that I wanted to port to async Python. The code used threading.RLock. So I had the option to rethink and refactor the whole thing to only use simple locks or write my own RLock. As I said, it felt very awkward to have async ship without batteries.
barry-scott (Barry Scott) November 29, 2022, 11:17pm 4
Given that async code is not using threads, you do not need a thread-type lock.
You can use normal Python variables to hold state that you check.
Complex I/O needs one or more state machines, I find, not locks.
If you do have threads then you could use the thread locks in your async code. But you must use mechanisms for communicating with the threads that do not block the event loop.
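One way to read the last point, sketched under assumptions: the blocking work and its `threading.Lock` stay inside worker threads, and the event loop talks to them through `asyncio.to_thread` (Python 3.9+), which awaits the result instead of blocking. The names `blocking_increment`, `counter` and `counter_lock` are made up for this example.

```python
import asyncio
import threading

counter_lock = threading.Lock()
counter = 0


def blocking_increment(n):
    # Runs in a worker thread, so a threading.Lock is the right tool here.
    global counter
    for _ in range(n):
        with counter_lock:
            counter += 1


async def main():
    # asyncio.to_thread hands each blocking call to a worker thread and
    # awaits its completion, so the event loop itself is never blocked.
    await asyncio.gather(
        asyncio.to_thread(blocking_increment, 1000),
        asyncio.to_thread(blocking_increment, 1000),
    )
    return counter
```

The thread lock is never acquired on the event loop thread in this sketch; doing so could stall every other task while it waits.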
guido (Guido van Rossum) November 30, 2022, 1:39am 5
To anyone who thinks locking in asyncio is simple, check out the git history of asyncio.Semaphore.
ConnorSMaynes (Connor Maynes) November 30, 2022, 2:33pm 6
I have created an asyncio.RLock implementation internally to avoid adding another external dependency. It is widely used to avoid all the shadow methods (i.e. remove() and _remove(), with remove() being locked because it is public and _remove() being used internally with one surrounding lock used by the caller) you get when using a regular asyncio.Lock to avoid deadlocking.
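For readers unfamiliar with the shadow-method pattern described above, a minimal sketch of what it tends to look like with a plain asyncio.Lock follows. The `Registry` class and its methods are hypothetical and only illustrate the shape of the workaround.

```python
import asyncio


class Registry:
    """Hypothetical container guarded by a plain asyncio.Lock."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self._items = {}

    async def add(self, key, value):
        async with self._lock:
            self._items[key] = value

    async def remove(self, key):
        # Public entry point: takes the lock, then delegates.
        async with self._lock:
            await self._remove(key)

    async def _remove(self, key):
        # Shadow method: assumes the caller already holds self._lock.
        await asyncio.sleep(0)  # stand-in for I/O inside the critical section
        self._items.pop(key, None)

    async def clear(self):
        async with self._lock:
            for key in list(self._items):
                # Calling self.remove() here would deadlock on the
                # non-reentrant lock, hence the unlocked twin.
                await self._remove(key)


async def demo():
    reg = Registry()
    await reg.add("a", 1)
    await reg.add("b", 2)
    await reg.remove("a")
    await reg.clear()
    return reg._items
```

With a reentrant lock, `clear()` could call the public `remove()` directly and the `_remove()` twin would not be needed.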
I completely agree with @robsdedude that similar reasons for needing this apply both in thread-land and task-land. asyncio copied nearly every primitive and method interface. Leaving this out of asyncio seems a bit unintuitive and inconsistent.
barry-scott (Barry Scott) November 30, 2022, 4:50pm 7
I guess I am missing something.
In my mind locks only have meaning between threads.
Why does an event loop want a lock at all?
Surely that will block the event loop defeating the purpose of async?
ConnorSMaynes (Connor Maynes) November 30, 2022, 5:20pm 8
Why does an event loop want a lock at all?
For the same reasons you might want locks in threaded code: to protect critical sections of code which might lead to a corrupted state otherwise.
Here is a (somewhat long) example where an async lock is useful.
In the output, you can see how publishing of the create message completes after publishing of the delete message when no lock is used. When a lock is used, the messages are published serially.
import asyncio

# let's say we have a collection of resources and a pub/sub system
# which we want to send all signals related to those resources to.
# let's also say that we want all messages (created, deleted, updated, etc.)
# to be sequential for each resource so that, for example, a deleted message
# does not arrive at some subscribers before the created message.
# to do that, we use a lock for each resource.

async def publish(msg, delay):
    print(f"starting publish: {msg}")
    await asyncio.sleep(delay)
    print(f"ending publish: {msg}")

async def create_resource(resource_lock):
    print("create - wait lock")
    await resource_lock.acquire()
    print("create - acquire lock")
    try:
        print("create - resource")
        # once we have created the resource, we have to finish publishing
        # or the other parts of the system won't know about it.
        # we have to hold the lock or publishing of the created message
        # may interleave with publishing of the deleted message
        await publish("create - resource", delay=3)
    finally:
        resource_lock.release()
        print("create - release lock")

async def delete_resource(resource_lock):
    # the lock is acquired outside the try block so we can still cancel
    # waiting on it if there is contention. maybe we come back later.
    print("delete - wait lock")
    await resource_lock.acquire()
    print("delete - acquire lock")
    try:
        print("delete - resource")
        await publish("delete - resource", delay=2)
    finally:
        resource_lock.release()
        print("delete - release lock")

class DummyLock:
    # stand-in with the same interface as asyncio.Lock that never waits
    async def acquire(self):
        ...

    def release(self):
        ...

async def main():
    print("\nwith lock...")
    resource_lock = asyncio.Lock()
    creator = asyncio.create_task(create_resource(resource_lock))
    await asyncio.sleep(0)
    deleter = asyncio.create_task(delete_resource(resource_lock))
    await asyncio.wait([creator, deleter])

    print("\nwith no lock...")
    resource_lock = DummyLock()
    creator = asyncio.create_task(create_resource(resource_lock))
    await asyncio.sleep(0)
    deleter = asyncio.create_task(delete_resource(resource_lock))
    await asyncio.wait([creator, deleter])

asyncio.run(main())
Output:
with lock...
create - wait lock
create - acquire lock
create - resource
starting publish: create - resource
delete - wait lock
ending publish: create - resource
create - release lock
delete - acquire lock
delete - resource
starting publish: delete - resource
ending publish: delete - resource
delete - release lock
with no lock...
create - wait lock
create - acquire lock
create - resource
starting publish: create - resource
delete - wait lock
delete - acquire lock
delete - resource
starting publish: delete - resource
ending publish: delete - resource
delete - release lock
ending publish: create - resource
create - release lock
Surely that will block the event loop defeating the purpose of async?
All the asyncio primitives are specially designed using futures to avoid blocking the event loop. When one of them would block, a future is created in the background and the event loop moves on to other tasks and comes back to the task when the future is done (cancelled or result is set).
If you were to use threading primitives in asyncio code, that would block the event loop, but that is why asyncio has its own version of almost every primitive from threading.
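A small sketch may make the non-blocking behaviour concrete. It assumes nothing beyond the standard asyncio.Lock; the coroutine names `holder`, `waiter` and `ticker` are invented for this example. While `waiter` is suspended on the lock, the unrelated `ticker` task keeps getting scheduled.

```python
import asyncio


async def holder(lock, log):
    async with lock:
        log.append("holder: acquired")
        await asyncio.sleep(0.2)  # hold the lock across an await
        log.append("holder: releasing")


async def waiter(lock, log):
    log.append("waiter: waiting")
    async with lock:  # suspends this task only, not the event loop
        log.append("waiter: acquired")


async def ticker(log):
    # Unrelated work that keeps running while waiter is suspended.
    for i in range(3):
        await asyncio.sleep(0.01)
        log.append(f"tick {i}")


async def main():
    lock = asyncio.Lock()
    log = []
    await asyncio.gather(holder(lock, log), waiter(lock, log), ticker(log))
    return log
```

In the returned log, the tick entries appear between "waiter: waiting" and "waiter: acquired", which would not happen if waiting on the lock blocked the loop.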
Rosuav (Chris Angelico) November 30, 2022, 6:10pm 9
Notably, they are only important in asyncio code when the critical section includes an await point, in contrast with threaded code where everything is an await point. So that makes them less important, but still of value.
Recursive locks, then, are only important if (a) the critical section must be entered by at most one task at a time; (b) this critical section can be entered by the same task more than once, which isn’t a problem; and (c) there are await points within this critical section. Rare, but definitely possible.
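The role of the await point can be shown with a small sketch. The worker names are hypothetical, and `asyncio.sleep(0)` stands in for real I/O. Without an await between the read and the write, tasks cannot interleave; with one, updates can be lost unless a lock covers the section.

```python
import asyncio


async def bump_no_await(state):
    # No await between read and write: no other task can interleave.
    state["n"] = state["n"] + 1


async def bump_with_await(state):
    value = state["n"]
    await asyncio.sleep(0)  # await point inside the critical section
    state["n"] = value + 1  # may overwrite another task's update


async def bump_locked(state, lock):
    async with lock:
        value = state["n"]
        await asyncio.sleep(0)
        state["n"] = value + 1


async def run(worker, *args):
    # Run ten concurrent workers against one shared counter.
    state = {"n": 0}
    await asyncio.gather(*(worker(state, *args) for _ in range(10)))
    return state["n"]


async def main():
    safe = await run(bump_no_await)
    racy = await run(bump_with_await)
    locked = await run(bump_locked, asyncio.Lock())
    return safe, racy, locked
```

Here `safe` and `locked` both end at 10, while `racy` ends lower because several tasks read the same value before any of them writes it back.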
EpicWink (Laurie O) November 30, 2022, 9:26pm 10
If it’s common for users to write their own recursive lock, perhaps it makes sense to include some links to some third-party libraries providing RLocks in the documentation for asyncio.Lock.
JoshuaAlbert (Joshua Albert) June 15, 2023, 12:25am 11
mattyoungberg (Matt Youngberg) January 16, 2025, 6:09pm 12
I wrote up a small class to show how a reentrant lock could be useful, only to see how easy it was to solve my problem. So I’m now empathetic to the argument that a reentrant lock isn’t strictly needed.
However, Python’s big strength is its usability and the “batteries included” approach alluded to in this thread.
In my head, I already knew how to solve my concurrency problem if I only had a reentrant lock. It took me some extra time to figure out how to solve it without a reentrant lock.
That “extra time spent” figuring out how to do this alone would hopefully be a strong enough argument to merit serious consideration of adding a reentrant lock. It’s a common synchronization primitive, even included in the threading library. There may be a maintainability cost, but if absorbed, it would make asyncio development more accessible in the end to programmers that already have a decent grounding in concurrency.