[Python-Dev] iterzip()
Jeremy Hylton jeremy@zope.com
Mon, 29 Apr 2002 18:31:57 -0400
"GvR" == Guido van Rossum <guido@python.org> writes:
I was imagining a scheme like this: Count increfs and decrefs. Set two thresholds. A collection occurs when both thresholds are exceeded. Perhaps 100 decrefs and 1000 increfs.
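(For concreteness, a toy version of that trigger rule -- the counter names and the 1000/100 numbers below are illustrative, not CPython's actual bookkeeping:)

    import gc

    # Hypothetical two-threshold trigger: run a collection only once
    # BOTH counters have passed their thresholds, then reset them.
    ALLOC_THRESHOLD, DEALLOC_THRESHOLD = 1000, 100
    allocs = deallocs = 0

    def _maybe_collect():
        global allocs, deallocs
        if allocs > ALLOC_THRESHOLD and deallocs > DEALLOC_THRESHOLD:
            gc.collect()
            allocs = deallocs = 0

    def note_alloc():
        global allocs
        allocs += 1
        _maybe_collect()

    def note_dealloc():
        global deallocs
        deallocs += 1
        _maybe_collect()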
GvR> I expect you can't literally count increfs and decrefs. These
GvR> are macros that need to be super-fast, and I think we can't
GvR> really afford to increment a counter on each macro invocation.
GvR> The current thresholds are used to count the number of
GvR> allocations.
I was being sloppy. I meant allocations and deallocations.
GvR> Adding the number of deallocations to the mix seems dangerous:
GvR> when (nearly) all data is tied up in cycles, there may not be any
GvR> deallocations.
Probably right, although function calls should provide a steady stream of deallocations. Frames, locals, &c. are deallocated on exit. So unless the code is blocked waiting for those cycles to be collected, it ought to eventually make progress.
GvR> It seems hard to distinguish between these two cases:
GvR> (a) lots of memory is allocated and kept alive for real by
GvR>     containers
GvR> (b) lots of memory is allocated and kept alive accidentally by
GvR>     cycles
GvR> The zip example is a case of (a), but the same allocation
GvR> behavior could ensue from case (b). Only running the collector
GvR> can determine which case we're seeing. I like Tim's idea of
GvR> adjusting the thresholds based upon the effectiveness of recent
GvR> collections.
I agree that this sounds interesting.
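(Something along these lines, say -- gc.collect() and gc.set_threshold() are real gc-module calls, but the doubling/halving policy and the numbers are invented for illustration:)

    import gc

    # Hypothetical adaptive policy: if a pass frees nothing, collect
    # less often; if it frees something, collect more eagerly.
    def adaptive_collect(threshold, lo=100, hi=100000):
        freed = gc.collect(0)  # youngest generation; returns number of
                               # unreachable objects found
        if freed == 0:
            threshold = min(threshold * 2, hi)   # fruitless: back off
        else:
            threshold = max(threshold // 2, lo)  # productive: be eager
        gc.set_threshold(threshold)              # new gen-0 threshold
        return threshold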
How does this come into play in the benchmark in question? It seems like we should have gotten a lot of quick collections, but it was still quite slow.
GvR> The benchmark has a list with a million elements, and during
GvR> execution more and more tuples are added to it. I expect that
GvR> somehow all these tuples are visited every 700 allocations,
GvR> which gives quadratic behavior. I guess the generational part
GvR> of the collector doesn't affect this --
I guess this is a question for Neil. I assumed that the generational part would affect this.
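(A rough guess at the shape of that benchmark -- not the actual iterzip() test, and the absolute timings will depend on the collector:)

    import time

    def build(n):
        # One long-lived list that keeps growing; every ~700 tuple
        # allocations trips another gen-0 collection pass.
        big = []
        for i in range(n):
            big.append((i, i))
        return big

    for n in (100000, 200000, 400000):
        t0 = time.time()
        build(n)
        print(n, round(time.time() - t0, 3))
    # If the collector revisits the big list (or its surviving tuples)
    # on every pass, total work grows quadratically in n.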
GvR> the only way to reduce the work would be if the big list
GvR> somehow was skipped once it was known to be "old". But since
GvR> it contains references to "new" objects (the 700 tuples
GvR> allocated last), it probably ends up being visited anyway.
I thought something was labelled old after it survived some number of collections. Why would the age of the objects it contains have any effect?
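(For what it's worth, today's gc module -- parts of this API postdate this thread -- exposes the aging machinery directly:)

    import gc

    # Objects that survive a collection of their generation are
    # promoted to the next one; that is the "labelled old" part.
    print(gc.get_threshold())   # per-generation thresholds, e.g. (700, 10, 10)

    big = [(i, i) for i in range(10)]  # a young list holding young tuples
    gc.collect(0)                      # gen-0 pass: survivors move to gen 1
    print(gc.get_count())              # allocations counted per generation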
Jeremy