[Python-Dev] Billions of gc's
Tim Peters tim.one@comcast.net
Mon, 29 Apr 2002 23:55:41 -0400
[Aahz]
> My take is that programs with a million live objects and no cycles are common enough that gc should be designed to handle that smoothly.
Well, millions of live objects is common but isn't a problem. The glitch we're looking at is a surprising slowdown with millions of live container objects. The latter isn't so common.
> I don't think that a programmer casually writing such applications (say, processing information from a database) should be expected to understand gc well enough to tune it.
People casually writing applications pushing the limits of their boxes are in for more surprises than just this.
> Having read the entire discussion so far, and NOT being any kind of gc expert, I would say that Tim's adaptive solution makes the most sense to me. For years, we told people with cyclic data to figure out how to fix the problem themselves; now that we have gc available, I don't think we should punish everyone else.
We're not trying to punish anyone, but innocent users with lots of containers can lose big despite our wishes: if we don't check them for cycles, they can run out of memory; if we do check them for cycles, it necessarily consumes time.
As a datapoint, here are the times (in seconds) for justzip() on my box after my checkin to precompute the result size (list.append behavior is irrelevant now):
    gc disabled:  0.64
    gc enabled:   7.32
    magic=2(*):   2.63
    magic=3(*):   2.02
(*) This is gcmodule.c fiddled to add this block after "collections1 = 0;" in the first branch of collect_generations():
    if (n == 0)
        threshold2 *= magic;
    else if (threshold2 > 5)
        threshold2 /= magic;
magic=1 is equivalent to the current code. That's all an "adaptive scheme" need amount to, provided the "*=" part were fiddled to prevent threshold2 from becoming insanely large. Boosting magic above 3 didn't do any more good in this test.
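To make the shape of such a scheme concrete, here is a minimal sketch in Python rather than C. It is not the gcmodule.c change itself: the function name adapt_threshold2, the cap parameter, its value of 1000, and the starting value of 10 are all made up for illustration. The cap is one possible way to keep the "*=" step from growing without bound.

```python
def adapt_threshold2(threshold2, n, magic, cap=1000):
    """Return the next threshold2 after a collection of the oldest generation.

    n is the number of unreachable objects that collection found.
    cap is a hypothetical upper bound; it is not part of the C fragment above.
    """
    if n == 0:
        # Nothing was found: check the oldest generation less often,
        # but never let the threshold grow past the cap.
        return min(threshold2 * magic, cap)
    if threshold2 > 5:
        # Garbage was found: check more often (integer division, as in C).
        return threshold2 // magic
    return threshold2


# Six collections in a row find nothing, then one finds 7 objects.
history = []
t = 10  # illustrative starting value
for found in [0, 0, 0, 0, 0, 0, 7]:
    t = adapt_threshold2(t, found, magic=3)
    history.append(t)

print(history)  # [30, 90, 270, 810, 1000, 1000, 333]
```

With magic=1 the function returns its input unchanged, which matches the remark that magic=1 is equivalent to the current code.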
At magic=3 it still takes 3+ times longer than with gc disabled, but that's a whale of a lot better than the current 11+ times longer. Note that with gc disabled, any cycle in any of the 1,000,001 containers this test creates would leak forever -- casual users definitely get something back for the time spent.
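The leak is easy to see in a small experiment. The sketch below is written for present-day CPython and is not taken from the test discussed here; the Node class and make_cycle helper are hypothetical names. It builds a two-object cycle with automatic gc switched off, then checks through a weak reference whether the objects survive.

```python
import gc
import weakref


class Node:
    """A container that can point at another Node."""

    def __init__(self):
        self.other = None


def make_cycle():
    """Build a two-object cycle and return a weak reference to one member."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)


was_enabled = gc.isenabled()
gc.disable()
try:
    probe = make_cycle()
    # Reference counting alone cannot free the pair:
    # each object keeps the other alive.
    alive_without_gc = probe() is not None
    # An explicit collection still works while automatic gc is disabled,
    # and it finds and frees the cycle.
    gc.collect()
    alive_after_collect = probe() is not None
finally:
    # Restore whatever state the interpreter was in before the experiment.
    if was_enabled:
        gc.enable()

print(alive_without_gc, alive_after_collect)
```

On CPython the cycle is expected to stay alive until gc.collect() runs, which is the memory cost that the time spent on collection buys back.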