[Python-Dev] iterzip()

Neil Schemenauer nas@python.ca
Mon, 29 Apr 2002 15:52:47 -0700


Jeremy Hylton wrote:

> I'm not sure what your trick is, since you've only described it as a "decref counter."

Sorry. I'm keeping track of the number of calls to PyObject_GC_Del since the last collection. While it's zero, collection doesn't happen. That makes the justzip function run fast but doesn't seem to help anywhere else.
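
The heuristic described above can be modelled in a few lines of Python. This is only a toy sketch, not the actual patch: the class name, the `run` helper, and the use of 700 as the allocation threshold are assumptions made for illustration, and the real counter would live in the C code around PyObject_GC_Del.

```python
class DeallocGatedCollector:
    """Toy model: skip collection while nothing has been freed since the last one."""

    def __init__(self, threshold=700):
        self.threshold = threshold  # net allocations before a collection is considered
        self.count = 0              # net tracked allocations since the last collection
        self.deallocs = 0           # frees (PyObject_GC_Del calls) since the last collection
        self.collections = 0        # how many collections the model has run

    def on_alloc(self):
        self.count += 1
        # Collect only if the threshold is exceeded AND something was freed.
        if self.count > self.threshold and self.deallocs > 0:
            self.collect()

    def on_dealloc(self):
        self.deallocs += 1
        if self.count > 0:
            self.count -= 1

    def collect(self):
        self.collections += 1
        self.count = 0
        self.deallocs = 0


def run(allocs, free_every=0):
    """Hypothetical driver: allocate `allocs` objects, freeing one every `free_every`."""
    model = DeallocGatedCollector()
    for i in range(1, allocs + 1):
        model.on_alloc()
        if free_every and i % free_every == 0:
            model.on_dealloc()
    return model.collections
```

A build-only workload such as `run(10_000)` never triggers a collection in this model, which matches the justzip behaviour, while `run(10_000, free_every=10)` still collects, which is consistent with the observation that the trick does not help much elsewhere.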

> I was imagining a scheme like this: Count increfs and decrefs. Set two thresholds. A collection occurs when both thresholds are exceeded. Perhaps 100 decrefs and 1000 increfs.

That would cut down on the collections more but I'm not sure how much in practice. In real code it seems like allocations and deallocations are pretty mixed up.
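
For comparison, the two-threshold scheme proposed above might look roughly like this. Again a toy sketch under stated assumptions: the class and method names are hypothetical, the counters are reset after each collection (the proposal does not say), and the 1000/100 values are the ones floated in the quoted text.

```python
class TwoThresholdCollector:
    """Toy model: collect only when BOTH the incref and decref thresholds are exceeded."""

    def __init__(self, incref_threshold=1000, decref_threshold=100):
        self.incref_threshold = incref_threshold
        self.decref_threshold = decref_threshold
        self.increfs = 0      # increfs seen since the last collection
        self.decrefs = 0      # decrefs seen since the last collection
        self.collections = 0  # how many collections the model has run

    def on_incref(self):
        self.increfs += 1
        self._maybe_collect()

    def on_decref(self):
        self.decrefs += 1
        self._maybe_collect()

    def _maybe_collect(self):
        if (self.increfs > self.incref_threshold
                and self.decrefs > self.decref_threshold):
            self.collections += 1
            # Assumption: both counters start over after a collection.
            self.increfs = 0
            self.decrefs = 0
```

With these numbers, a phase of 5000 increfs followed by fewer than 101 decrefs never collects. Code that mixes allocations and deallocations crosses both thresholds regularly, so the saving there depends on how mixed the workload is.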

> How does this come into play in the benchmark in question? It seems like we should have gotten a lot of quick collections, but it was still quite slow.

The GC cost is paid early and the objects get put in an older generation. Obviously that's a waste of time if they are deallocated in the near future. justpush deallocates as it goes so the GC is never triggered.

I just tried measuring the time spent in the GC while loading some nasty web pages in our system (stuff that looks at thousands of objects). I used the Pentium cycle counter since clock(2) seems to have very low resolution. Setting threshold0 to 7500 makes the GC take twice as much time as with the default setting (700). That surprised me. I thought it wouldn't make much difference. Maybe I screwed up. :-)
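
A measurement along these lines can be approximated from Python code in current CPython. The sketch below is not the method used above (that relied on the Pentium cycle counter inside the interpreter); it assumes the `gc.callbacks` hook and `time.perf_counter()`, neither of which existed in 2002, and `time_gc` and `build_many_tuples` are hypothetical names for a helper and a stand-in workload.

```python
import gc
import time


def time_gc(workload, threshold0):
    """Run workload() with the given threshold0; return (seconds in GC, collections)."""
    total = 0.0
    runs = 0
    started = 0.0

    def callback(phase, info):
        # CPython calls this with phase "start" before and "stop" after each collection.
        nonlocal total, runs, started
        if phase == "start":
            started = time.perf_counter()
        elif phase == "stop":
            total += time.perf_counter() - started
            runs += 1

    old = gc.get_threshold()
    gc.set_threshold(threshold0, *old[1:])  # change only the youngest generation
    gc.callbacks.append(callback)
    try:
        workload()
    finally:
        gc.callbacks.remove(callback)
        gc.set_threshold(*old)              # restore the previous settings
    return total, runs


def build_many_tuples(n=200_000):
    # Hypothetical stand-in workload: allocates many GC-tracked containers.
    return [(i, [i]) for i in range(n)]
```

Calling `time_gc(build_many_tuples, 700)` and `time_gc(build_many_tuples, 7500)` gives a rough comparison of the two settings. The numbers depend heavily on the workload and the interpreter version, so they should not be expected to reproduce the result reported here.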

Neil