[Python-Dev] Removing the GIL (Me, not you!)

Adam Olsen rhamph at gmail.com
Fri Sep 14 18:33:09 CEST 2007


On 9/14/07, Justin Tulloss <tulloss2 at uiuc.edu> wrote:

On 9/14/07, Adam Olsen <rhamph at gmail.com> wrote:
> > Could be worth a try. A first step might be to just implement
> > the atomic refcounting, and run that single-threaded to see
> > if it has terribly bad effects on performance.
>
> I've done this experiment. It was about 12% on my box. Later, once I
> had everything else set up so I could run two threads simultaneously, I
> found much worse costs. All those literals become shared objects that
> create contention.

It's hard to argue with cold hard facts when all we have is raw speculation.

What do you think of a model where there is a global "thread count" that keeps track of how many threads reference an object? Then there are thread-specific reference counters for each object. When a thread's refcount goes to 0, it decrefs the object's thread count. If you did this right, hopefully there would only be cache updates when you update the thread count, which will only be when a thread first references an object and when it last references an object.

I mentioned this idea earlier and it's growing on me. Since you've actually messed around with the code, do you think this would alleviate some of the contention issues?

There would be some poor worst-case behaviour. In the case of literals, you'd start referencing them when you call a function, then stop when the function returns. The same goes for any shared data structure.

I think caching/buffering refcounts in general holds promise though. My current approach uses a crude hash table as a cache and only flushes when there's a collision or when the tracing GC starts up. So far I've only got about 50% of the normal performance, but that's with 90% or more scalability, and I'm hoping to keep improving it.

-- Adam Olsen, aka Rhamphoryncus



More information about the Python-Dev mailing list