[Python-Dev] Billions of gc's

Aahz aahz@pythoncraft.com
Tue, 30 Apr 2002 00:21:26 -0400


On Mon, Apr 29, 2002, Tim Peters wrote:
>
> [Aahz]
>> My take is that programs with a million live objects and no cycles
>> are common enough that gc should be designed to handle that
>> smoothly.
>
> Well, millions of live objects is common but isn't a problem.  The
> glitch we're looking at is a surprising slowdown with millions of
> live container objects.  The latter isn't so common.
>
> [Aahz]
>> I don't think that a programmer casually writing such applications
>> (say, processing information from a database) should be expected to
>> understand gc well enough to tune it.
>
> People casually writing applications pushing the limits of their
> boxes are in for more surprises than just this <wink>.

Fair enough. I hadn't quite understood that it was specifically container objects, but obviously a database result will have lots of tuples, so I think that's a good real-world test case for whatever solution is proposed.
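As a concrete illustration (a minimal sketch of my own, not anything from the thread itself): the tuning in question comes down to gc.set_threshold() and gc.disable()/gc.enable(). Raising the generation-0 threshold, or disabling the collector around a bulk-allocation phase, reduces how often the collector has to traverse a large live-container population. The threshold value and the allocation loop below are arbitrary examples:

    import gc

    # Defaults are (700, 10, 10): a generation-0 collection runs once
    # allocations minus deallocations of container objects exceed 700.
    print(gc.get_threshold())

    # Raise the gen-0 threshold so collections run far less often
    # while millions of live containers are being built up.
    gc.set_threshold(10000, 10, 10)

    # Or skip collection entirely around a bulk-allocation phase:
    gc.disable()
    try:
        data = [(i, i * i) for i in range(1000000)]  # many live tuples
    finally:
        gc.enable()

The trade-off is that a higher threshold (or a disabled collector) delays the reclamation of any cyclic garbage that does exist in exchange for fewer traversals.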

Here's a question: suppose we've got a database result with 10K rows (I'd say that is fairly common), and we're processing each row with a regex (something that can't be done in SQL). What's a ballpark for gc overhead before and after your fix? (I'm still not set up to compile CVS, so I can't do it myself.)
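Here is a minimal sketch of the kind of before/after timing being asked for (my own illustration, not a measurement from the thread; the row count, the regex, and the synthetic rows are arbitrary placeholders). It times the same row-processing loop with the collector enabled and then disabled:

    import gc
    import re
    import time

    def bench(n_rows=10000, use_gc=True):
        pattern = re.compile(r"555-\d{4}")
        if use_gc:
            gc.enable()
        else:
            gc.disable()
        try:
            start = time.perf_counter()
            # Building the result set allocates one tuple per row,
            # i.e. the gc-tracked container objects under discussion.
            rows = [(i, "record %d: 555-%04d" % (i, i % 10000))
                    for i in range(n_rows)]
            # Per-row regex work standing in for what can't be done
            # in SQL.
            hits = sum(1 for _, text in rows if pattern.search(text))
            return time.perf_counter() - start, hits
        finally:
            gc.enable()

    if __name__ == "__main__":
        t_on, _ = bench(use_gc=True)
        t_off, _ = bench(use_gc=False)
        print("gc on:  %.4fs" % t_on)
        print("gc off: %.4fs" % t_off)

The gap between the two timings gives a rough upper bound on the collector's share of the runtime for this workload; per the thread, it should only become noticeable as the live-container count climbs into the millions.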

Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

"I used to have a .sig but I found it impossible to please everyone..." --SFJ