[Python-Dev] Making the GIL faster & lighter on Windows

Phillip Sitbon phillip.sitbon+python-dev at gmail.com
Tue May 26 21:48:49 CEST 2009


Hi everyone,

I'm new to the list but I've been embedding Python and working very closely with the core sources for many years now. I discovered Python a long time ago when I needed to embed a scripting language and found the PHP sources... unreadable ;)

Anyway, I'd like to ask something that may have been asked already, so I apologize if this has been covered.

Instead of removing the GIL, has anyone thought of making it more lightweight? The current situation for Windows is that the single-thread case is decently fast (via interlocked operations), but it drops to using an event object in the case of contention (see thread_nt.h).

Now, I don't have any specific evidence aside from my experience in Windows multithreaded programming, but event objects are often considered the slowest synchronization mechanism available. So, what are the alternatives? Mutexes or critical sections. Semaphores too, if you want to get fancy, but I digress.

Because mutexes have the capability of inter-process locking, which we don't need, critical sections fit the bill as a lightweight locking mechanism. They work in a way similar to how the Python GIL is handled: first, attempt an interlocked operation, and if another thread owns the lock, wait on a kernel object. They are known to be extremely fast.
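To make that two-step control flow concrete, here is a rough model in Python rather than Win32 C. It is only a sketch: the class name TwoPhaseLock and the "fast"/"slow" return values are made up for illustration, and a single threading.Lock stands in for both the interlocked word and the kernel object, so it shows the shape of the logic and says nothing about performance.

```python
import threading


class TwoPhaseLock:
    """Toy model of 'try cheaply first, block only under contention'."""

    def __init__(self):
        # Stand-in for both the interlocked word and the kernel wait object.
        self._lock = threading.Lock()

    def try_acquire(self):
        # Non-blocking attempt, analogous in spirit to TryEnterCriticalSection.
        return self._lock.acquire(blocking=False)

    def acquire(self):
        # Fast path: nobody holds the lock, so the cheap attempt succeeds.
        if self.try_acquire():
            return "fast"
        # Slow path: another holder exists, so block until it releases.
        self._lock.acquire()
        return "slow"

    def release(self):
        self._lock.release()
```

In the uncontended case acquire() never reaches the blocking call, which is the property that makes the single-thread case cheap.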

There are some catches with using a critical section instead of the current method:

  1. It is recursive, while the current GIL setup is not. Would it break Python to support (or deal with) recursive behavior at the GIL level? Note that we can still disallow recursion and fail because we know if the current thread is the lock owner, but the return from the lock function is usually only checked when the wait parameter is zero (meaning "don't block, just try to acquire"). The biggest problem I see here is how mixing the PyGILState_* API with multiple interpreters will behave: when PyGILState_Ensure() is called while the GIL is held for a thread state under an interpreter other than the main interpreter, it tries to re-lock the GIL. This would normally cause a deadlock, but the best we could do with a critical section is have the call fail and/or increase a recursion counter. If maintaining behavior is absolutely necessary, I guess it would be pretty easy to force a deadlock. Personally, I would prefer a Py_FatalError or something like it.

  2. Backwards incompatibility: TryEnterCriticalSection isn't available pre-NT4, so Windows 95 support is broken. Microsoft no longer supports Windows 95, or even mentions it in the list of supported OSes for their API functions, so... non-issue? Part of the critical section data structure is exposed to us, so I bet it would be easy to implement the function manually.

  3. ?? - I'm sure there are other issues that deserve a look.
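For the first catch, the idea of tracking the owner so that recursion is rejected instead of silently succeeding can be sketched like this. Again this is Python standing in for the C code, under stated assumptions: NonRecursiveGuard is a hypothetical name, threading.RLock plays the role of the recursive critical section, and raising RuntimeError is just a placeholder for whatever failure mode is chosen (a fatal error, a forced deadlock, or a failed return).

```python
import threading


class NonRecursiveGuard:
    """Wraps a recursive lock but refuses re-entry by the owning thread."""

    def __init__(self):
        self._rlock = threading.RLock()  # stand-in for a critical section
        self._owner = None               # ident of the holder, or None

    def acquire(self, wait=True):
        me = threading.get_ident()
        # Only this thread ever stores its own ident in _owner, and only
        # while holding the lock, so this comparison cannot be a false hit.
        if self._owner == me:
            raise RuntimeError("recursive acquire by the lock owner")
        # wait=False means "don't block, just try to acquire".
        if not self._rlock.acquire(blocking=wait):
            return False
        self._owner = me
        return True

    def release(self):
        if self._owner != threading.get_ident():
            raise RuntimeError("release by a thread that is not the owner")
        # Clear ownership before releasing so no other thread sees a stale owner.
        self._owner = None
        self._rlock.release()
```

The owner check runs before touching the underlying lock, so the recursion counter of the recursive primitive never gets above one.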

I've given this a shot already while doing some concurrency testing with my ISAPI extension (PyISAPIe). First of all, nothing looks broken yet. I'm using my modified python26.dll to run all of my Python code and trying to find anywhere it could possibly break. For multiple concurrent requests against a single multithreaded ISAPI handler process, I see a statistically significant speed increase depending on how much Python code is executed. With more Python code executed (e.g. a Django page), the speedup was about 2x. I haven't tested with varied values for _Py_CheckInterval aside from finding a sweet spot for my specific purposes, but using 100 (the default) would likely make the performance difference more noticeable. A spin mutex also does well, but the results vary a lot more.

Just as a disclaimer, my tests were nowhere near scientific, but if anyone needs convincing I can come up with some actual measurements. I think at this point most of you are wondering more about what it would break.

Hopefully I haven't wasted anyone's time - I just wanted to share what I see as a possibly substantial improvement to Python's core. Let me know if you're interested in a patch to use for your own testing.

Cheers,

Phillip


