[Python-3000] Is reference counting still needed?

Tim Peters tim.peters at gmail.com
Thu Apr 20 12:41:27 CEST 2006


[Erno Kuusela]

The refcounting vs generational GC reasoning I've heard argues that refcounting is less cache-friendly: the cache line containing the refcount field of the pointed-to object is dirtied (or at least loaded) every time something is done with the reference.

[Greg Ewing]

Has anyone actually measured this effect in a real system, or is it just theorising?

Of course people have tried it in real systems. Then they write about it, and everyone gets confused <0.5 wink>. Efficiency of a GC strategy is deeply dependent, in highly complex ways, on many aspects of the system in question. It's a tempting but basically idiotic mistake to imagine, e.g., that a strategy that works well for LISP would also work well for CPython (or vice versa).

If it's a real effect, would this be helped at all if the refcounts weren't stored with the objects, but kept all together in one block of memory? Or would that just make things worse?

That's also been tried, and "it depends". An obvious thing about CPython is that you can't do anything non-trivial with an object O without reading up O's type pointer first -- but as soon as you do that, chances are high that you have O's refcount in L1 cache too, since the refcount is adjacent to the type pointer in the base PyObject struct, and that's true of all objects in CPython. "Reading up the refcount" is essentially free then, given that you had to read up the type pointer anyway (or "reading up the type pointer" is essentially free, given that you already read up the refcount).

With dynamic languages becoming increasingly important these days, I wonder whether anyone has thought about what sort of cache or other memory architecture modifications might make things like refcounting more efficient.

Yes, but I don't think anyone's offering to build a P3K chip for us ;-)
