[Python-Dev] GC Proposal (original) (raw)

Adam Olsen rhamph at gmail.com
Fri Jun 27 02:53:15 CEST 2008


We need two counters: one is the total number of traceable objects (those we would inspect if we did a full collection) and a number of "pending" trace operations. Every time an object is moved into the last generation, we increase "pending" by two - once for itself and once for an older object. Once pending equals the total number of traceable objects we do a full collection (and reset "pending" to 0).

This behaves well under both extremes. First, if allocating in a burst, it waits until the total number of objects has doubled before doing a collection, which works out to about 2 passes per object.

Second, a long running program may allocate an object, get it into the oldest generation (adding to "pending"), but then delete it through normal means. The pending trace is effectively borrowed by the remaining older objects, which ensures they will get scanned again eventually. However, we already payed for it when the young object was allocated, so the net cost is still a constant factor.

Various refinements are possible, such as only adding 1.5 to "pending", or taking off some of the 2 if a young object is deleted without getting traced. We could also tweak when in the object life-cycle "pending" is increased.

The first two generations would remain as they are today.

-- Adam Olsen, aka Rhamphoryncus



More information about the Python-Dev mailing list