Should our locks (and semaphores, queues, etc.) be fair? · Issue #54 · python-trio/trio

Currently, trio._sync and trio._core._parking_lot go to some effort to make our synchronization primitives fair, in the sense that e.g. when a lock is released it always goes to the next task that requested it, in FIFO order. This is attractive because it avoids starvation issues where one task just can't get a lock. The most obvious example would be when you have two tasks running a loop like

    while True:
        async with some_lock:
            ...

where they keep releasing the lock and then immediately reacquiring it. Currently, the two tasks will alternate round-robin. If we allow barging, then in the naive implementation the task that just released the lock will deterministically succeed in re-acquiring it, so the lock never changes hands. Of course this isn't the most representative example, but it illustrates the general idea.
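To make the handoff difference concrete, here's a toy, synchronous model of the two release policies -- just my sketch, not trio's actual implementation, and FairLock/BargingLock are made-up names:

```python
from collections import deque

class FairLock:
    """On release, ownership passes directly to the oldest waiter."""

    def __init__(self):
        self.holder = None
        self.waiters = deque()  # FIFO queue of tasks blocked in acquire()

    def acquire(self, task):
        if self.holder is None:
            self.holder = task
        else:
            self.waiters.append(task)

    def release(self):
        # The releasing task can't immediately re-acquire: by the time it
        # tries, the lock already belongs to the next waiter in line.
        self.holder = self.waiters.popleft() if self.waiters else None

class BargingLock:
    """On release, the lock is just marked free; whoever acquires next wins."""

    def __init__(self):
        self.holder = None
        self.waiters = deque()

    def acquire(self, task):
        if self.holder is None:
            self.holder = task  # barging: skips ahead of any queued waiters
        else:
            self.waiters.append(task)

    def release(self):
        # In a real implementation a waiter would now be woken to retry
        # acquire(); if the releasing task loops around and calls acquire()
        # first, it wins every time and the waiters starve.
        self.holder = None
```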

Nonetheless, the conventional wisdom is that fairness is actually bad (!!).

It's not clear to me whether the conventional arguments apply here, though. AFAICT the biggest reason why fairness is considered bad is that it gets in the way of the very common case where a lock is acquired and then released within a single scheduling quantum, without contention. Barging is great for that case: on a uniprocessor system where a lock can always be acquired and released without a preemption in between, barging entirely eliminates contention. Wow! And these kinds of micro-critical-sections that just protect a few quick operations are very common in traditional preemptive code. But... for us, micro-critical-sections often don't need locks at all! So the case that may dominate the traditional analysis basically doesn't exist for us. Hmm.

Also, our context switching has very different characteristics than theirs, in terms of the relative costs of different operations. I don't feel like I have a very clear sense of how this plays out in practice, though -- PyPy tracing might change things a lot, it might matter a lot whether the scheduler gets smart enough to start eliding yield_brieflys (and if so, what policy it uses to do so), etc.

And another important consideration is that we can potentially have deep integration between our locking primitives and our scheduler, which might let us avoid some of the specific pathologies that often hit user-level locking primitives trying to interact with a kernel scheduler. (In particular, this comment talks about how we might be able to define fairness such that the task we hand off to is always the next task to be run.)
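As a concrete illustration of the micro-critical-section point (my example, not from the issue): under cooperative scheduling a task can only be interrupted at an await, so a read-modify-write with no checkpoint in the middle is already atomic with respect to other tasks, and only code that awaits mid-update actually needs a Lock:

```python
import trio

counter = {"value": 0}  # hypothetical shared state, just for illustration

async def safe_increment():
    # No await between the read and the write, so no other task can run in
    # between -- this is already atomic without any lock.
    counter["value"] += 1

async def racy_increment():
    old = counter["value"]
    await trio.sleep(0)  # checkpoint: another task may run here...
    counter["value"] = old + 1  # ...so this update can lose increments
```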

The answer here may also vary for different primitives. In particular, fairness for locks and fairness for queues seem rather different. I guess for queues, fairness is useful if you're multiplexing a set of inputs with backpressure into put, or using get to multiplex a set of outputs into different channels, because it ends up being fair across those input/output channels?
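For example (a sketch under my own assumptions -- the issue's put/get roughly correspond to send/receive on trio's memory channels, and the producer names here are made up), fairness across tasks blocked on a bounded queue translates into fairness across the input streams being multiplexed:

```python
import trio

async def producer(name, send_channel):
    while True:
        # With the buffer full, send() blocks; if wakeups are FIFO-fair, the
        # producers take turns, so every input stream keeps making progress.
        await send_channel.send(name)

async def main():
    send_channel, receive_channel = trio.open_memory_channel(1)
    async with trio.open_nursery() as nursery:
        for name in ["a", "b", "c"]:
            nursery.start_soon(producer, name, send_channel)
        # With fair wakeups this prints a/b/c in a roughly round-robin order;
        # with barging, one producer could monopolize the queue.
        for _ in range(9):
            print("got:", await receive_channel.receive())
        nursery.cancel_scope.cancel()

trio.run(main)
```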