[Python-Dev] GIL removal question
Aaron Westendorf aaron at agoragames.com
Fri Aug 12 21:17:59 CEST 2011
Even in the Erlang model, the aforementioned issues of bus contention put a cap on the number of threads you can run in any given application, assuming there's any amount of cross-thread synchronization. I wrote a blog post on this subject, based on my experience tuning RabbitMQ on NUMA architectures:
http://blog.agoragames.com/blog/2011/06/24/of-penguins-rabbits-and-buses/
It should be noted that Erlang processes are not the same as OS processes. They are more akin to green threads, scheduled on N real OS threads, which in turn run on C cores. The end effect is the same, though: the data is effectively shared across NUMA nodes, which runs into basic physical constraints.
I used to think the GIL was a major bottleneck, and though I'm not fond of it, my recent experience has highlighted that any application which uses shared memory will have significant bus contention when scaling across all cores. The best course of action is a shared-nothing, MPI-style design, but in 64-bit land, that can mean significant wasted address space.
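A minimal sketch of what shared-nothing, message-passing style can look like in Python, assuming a POSIX system where the "fork" start method is available. The names `worker` and `shared_nothing_sum` are made up for illustration. Each process works on its own private range, and the only thing that crosses a process boundary is one small result message.

```python
import multiprocessing as mp


def worker(conn, start, stop):
    """Compute on private data, then send one small message to the parent."""
    total = sum(i * i for i in range(start, stop))
    conn.send(total)
    conn.close()


def shared_nothing_sum(n, workers=4):
    """Sum i*i for i in range(n), split across independent processes."""
    if n <= 0:
        return 0
    ctx = mp.get_context("fork")  # assumption: POSIX; not available on Windows
    step = -(-n // workers)  # ceiling division, so at most `workers` chunks
    jobs = []
    for lo in range(0, n, step):
        # One-way pipe: the parent receives, the child sends.
        parent_end, child_end = ctx.Pipe(duplex=False)
        proc = ctx.Process(target=worker,
                           args=(child_end, lo, min(lo + step, n)))
        proc.start()
        child_end.close()  # the parent keeps only the receiving end
        jobs.append((proc, parent_end))
    total = 0
    for proc, parent_end in jobs:
        total += parent_end.recv()  # the only cross-process traffic
        proc.join()
    return total
```

Calling `shared_nothing_sum(1000)` gives the same answer as the single-process `sum(i * i for i in range(1000))`. The workers never touch a common data structure, so there is nothing for them to contend over except the short result messages.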
-Aaron
On Fri, Aug 12, 2011 at 2:59 PM, Sturla Molden <sturla at molden.no> wrote:
On 12.08.2011 18:51, Xavier Morel wrote:
* Erlang uses "Erlang processes", which are very cheap preempted processes (no shared memory). There have always been tens to thousands to millions of Erlang processes per interpreter [...] source contention within the interpreter going back to pre-SMP by setting the number of schedulers per node to 1 can yield increased overall performances)
Technically, one can make threads behave like processes if they don't share memory pages (though they will still share address space). Erlang's use of 'process' instead of 'thread' does not mean an Erlang process has to be implemented as an OS process. With one interpreter per thread, and a malloc that does not let threads share memory pages (one heap per thread), Python could do the same.
On Windows, there is an API function called HeapAlloc, which lets us allocate memory from a dedicated heap. The common use case is to prevent threads from sharing memory, thus behaving like light-weight processes (except address space is shared). On Unix, it is more common to use fork() to create new processes instead, as processes are more light-weight than on Windows.
Sturla
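To make the fork() point concrete, here is a small sketch, assuming a Unix-like system where `os.fork` exists. The function name `fork_isolation_demo` is hypothetical. After the fork, the child mutates a list and reports what it sees through a pipe, while the parent's copy of the same list stays untouched.

```python
import os


def fork_isolation_demo():
    """Return (parent_view, child_view) of a list mutated only in the child."""
    data = [0, 0, 0]
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: works on its own copy of `data`.
        try:
            os.close(read_fd)
            data[0] = 42
            os.write(write_fd, repr(data).encode())
            os.close(write_fd)
        finally:
            os._exit(0)  # leave without running the parent's cleanup code
    # Parent process: read the child's report, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        child_view = reader.read().decode()
    os.waitpid(pid, 0)
    return data, child_view
```

The parent gets back its own unmodified list together with the child's text report of the modified one. Any data the two sides want to exchange has to go through an explicit channel such as the pipe used here.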