[Python-Dev] iterzip() (original) (raw)
Guido van Rossum guido@python.org
Mon, 29 Apr 2002 18:19:10 -0400
- Previous message: [Python-Dev] iterzip()
- Next message: [Python-Dev] iterzip()
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
I was imagining a scheme like this: Count increfs and decrefs. Set two thresholds. A collection occurs when both thresholds are exceeded. Perhaps 100 decrefs and 1000 increfs.
I expect you can't literally count increfs and decrefs. These are macros that need to be super-fast, and I think we can't really afford to increment a counter on eacn macro invocation.
The current thresholds are used to count the number of allocations. Adding the number of deallocations to mix seems dangerous: when (nearly) all data is tied up in cycles, there may not be any deallocations. It seems hard to distinguish between these two cases:
(a) lots of memory is allocated and kept alive for real by containers
(b) lots of memory is allocated and kept alive accidentally by cycles
The zip example is a case of (a), but the same allocation behavior could ensue from case (b). Only running the allocator can determine which case we're seeing. I like Tim's idea of adjusting the thresholds base upon the effectiveness of recent collections.
How does this come into play in the benchmark in question? It seems like we should have gotten a lot of quick collections, but it was still quite slow.
The benchmark has a list with a million elements, and during execution more and more tuples are added to it. I expect that somehow all these tuples are visited every 700 allocations, which gives quadratic behavior. I guess the generational part of the collector doesn't affect this -- the only way to reduce the work would be if the big list somehow was skipped once it was known to be "old". But since it contains references to "new" object (the 700 tuples allocated last), it probably ends up being visited anyway.
--Guido van Rossum (home page: http://www.python.org/~guido/)
- Previous message: [Python-Dev] iterzip()
- Next message: [Python-Dev] iterzip()
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]