[Python-3000] Kill GIL?

Ivan Krstić krstic at solarsail.hcs.harvard.edu
Mon Sep 18 09:38:38 CEST 2006


Bob Ippolito wrote:

Candygram is heavyweight by trade-off, not because it has to be. Candygram could absolutely be implemented efficiently in current Python if a Twisted-like style was used.

Specifically?

* Bite the bullet; write and support a stdlib SHM primitive that works [..] a lightweight fork-and-coordinate wrapper provided in the stdlib.

I really don't think that's the right approach. If we're going to bother supporting distributed processing, we might as well support it in a portable way that can scale across machines.

Fork-and-coordinate is a specialized case of distribute-and-coordinate. Other d-a-c mechanisms can be provided, including those that utilize some form of RPC as a transport. SHM is orthogonal to all of this.

Note that scaling across machines is only equivalent to scaling across CPUs in the simple case; in more complicated cases, there's a lot of glue involved that grid frameworks like BOINC provide. If we end up shipping any cross-machine abilities in the stdlib, we'd have to make sure it's clear that we're not attempting to provide a grid framework, just the plumbing that someone could use to build one.
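
To make "fork-and-coordinate is a specialized case of distribute-and-coordinate" concrete, here is a minimal, POSIX-only sketch. The names fork_map, serial_map and distribute are hypothetical and are not an existing or proposed stdlib API; the point is only that the coordinating code does not care which transport carries the work.

```python
import os
import pickle


def fork_map(func, items):
    """Hypothetical fork-based transport: one forked child per item."""
    children = []
    for item in items:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute, send the pickled result up the pipe, exit.
            os.close(read_fd)
            status = 0
            try:
                with os.fdopen(write_fd, "wb") as out:
                    pickle.dump(func(item), out)
            except BaseException:
                status = 1
            os._exit(status)  # skip the parent's cleanup handlers
        # Parent: keep only the read end of this child's pipe.
        os.close(write_fd)
        children.append((pid, read_fd))

    results = []
    for pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as src:
            payload = src.read()  # read to EOF before reaping the child
        _, wait_status = os.waitpid(pid, 0)
        if wait_status != 0:
            raise RuntimeError(f"worker {pid} failed")
        results.append(pickle.loads(payload))
    return results


def serial_map(func, items):
    """Stand-in for any other transport (an RPC-based one, for example)."""
    return [func(item) for item in items]


def distribute(func, items, transport=fork_map):
    """Coordinator: hands the work to whichever transport is plugged in."""
    return transport(func, items)
```

Calling distribute(f, data) forks locally, while distribute(f, data, transport=serial_map) runs the same coordination code over a different mechanism. None of this touches shared memory, which is the sense in which SHM is a separate question.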

* Bite the mortar shell, and remove the GIL.

This really isn't even an option because we're not throwing away the current C Python implementation. The C API would have to change quite a bit for that.

Hence 'mortar shell'. It can be done, but I think Guido's been pretty clear on it not happening anytime soon.

We have cooperatively scheduled microthreads with ugly syntax (yield), or more platform-specific and much less debuggable microthreads with stackless or greenlets.

Right. This is why I'm not sure we want to be recommending either as the Python way to do concurrency.
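
For readers who haven't seen the yield-based style referred to above, a minimal sketch of cooperatively scheduled microthreads follows. The worker and run names are made up for illustration; real frameworks add I/O waiting, error handling and much more.

```python
from collections import deque


def worker(name, steps, log):
    """A microthread: does one unit of work, then yields control."""
    for i in range(steps):
        log.append((name, i))
        yield  # hand control back to the scheduler


def run(tasks):
    """Round-robin scheduler: resume each task in turn until all finish."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished; do not reschedule it
        ready.append(task)


log = []
run([worker("a", 2, log), worker("b", 3, log)])
# log is now [("a", 0), ("b", 0), ("a", 1), ("b", 1), ("b", 2)]
```

Nothing here runs in parallel: a task that never reaches a yield starves every other task, and every function that wants to block has to be written as a generator. That is the "ugly syntax" cost.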

What use case requires sharing?

Strictly speaking, it's always avoidable. But in setup-heavy systems, avoiding SHM is a massive and costly pain. Consider web applications; ideally, you can preload one copy of all of your translations, database information, and other static information, into RAM -- and have worker threads do reads from this table as they're processing individual requests. Without SHM, you'd have to either duplicate the static set in memory for each CPU, or make individual requests for each desired piece of information to the master process that keeps the static set in RAM.
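
A rough sketch of the preload-once pattern, using threads as the workers that share an address space. TRANSLATIONS, handle_request and serve are hypothetical names, and the table contents are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical static set, loaded into RAM once at startup.
TRANSLATIONS = {
    "hello": {"fr": "bonjour", "de": "hallo"},
    "bye": {"fr": "au revoir", "de": "tschüss"},
}


def handle_request(request):
    """Worker: read straight from the single shared copy."""
    word, lang = request
    return TRANSLATIONS[word][lang]


def serve(requests, workers=4):
    """Process requests on a pool of threads sharing TRANSLATIONS."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_request, requests))
```

With threads there is one copy of the table, but the GIL keeps the workers from executing Python code on several CPUs at once. Moving the workers into separate processes removes that limit and brings back the choice described above: one copy per process, or a round trip to a master process for every lookup.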

I've seen a number of computationally-bound systems that require an authoritative copy of a (large) dataset in RAM, and are OK with paying the cost of a read waiting on a lock during a write (and since writes only happen at the completion of complex calculations, they generally want to use locking like that provided by brlocks in the Linux kernel). All of this is workable without SHM, but some of it gets really unwieldy.
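
The locking policy described here (many cheap reads, rare writes that wait for exclusive access) can be sketched as a simple readers-writer lock. This is not how brlocks are implemented in the kernel; it only mirrors the read-favoring trade-off, and ReadMostlyLock, read_total and publish are hypothetical names.

```python
import threading


class ReadMostlyLock:
    """Many concurrent readers, or exactly one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self):
        with self._cond:
            while self._writing:  # readers wait only for an active write
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # wake a waiting writer

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()  # writers wait for everyone
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()


dataset = {"total": 0}  # stand-in for the large authoritative dataset
lock = ReadMostlyLock()


def read_total():
    lock.acquire_read()
    try:
        return dataset["total"]
    finally:
        lock.release_read()


def publish(value):
    """Called at the end of a long calculation to update the dataset."""
    lock.acquire_write()
    try:
        dataset["total"] = value
    finally:
        lock.release_write()
```

Because new readers are admitted whenever no write is in progress, a writer can starve under constant read load. That is acceptable only when writes are rare, as in the systems described above.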

-- Ivan Krstić <krstic at solarsail.hcs.harvard.edu> | GPG: 0x147C722D
