[Python-Dev] Dictionary tuning (original) (raw)

Raymond Hettinger python@rcn.com
Mon, 28 Apr 2003 23:15:50 -0400


[Tim Peters]

Dicts can be larger after the patch, but never smaller, so there's nothing opposing the "can be larger" part: on average, allocated address space must be strictly larger than before.

I think of the resize intervals as steps on a staircase. My patch eliminates the even numbered stairs. The average logarithmic slope of the staircase doesn't change, there are just fewer discrete steps. Also, the height of the staircase doesn't change unless the top stair was even, in which case, another half step is added.

[Tim Peters]

Resizing a large dict is an expensive operation too. Not only are there fewer resizes, but the cost of the operation becomes cheaper because it takes less time to load a sparse dictionary than one that is more dense.

[Tim Peters]

Whether that matters on average to the average user is something we can answer rigorously just as soon as we find an average user with an average program . I'm not inclined to worry much about it.

Me either, I suspect that it is rare to find a stable application that is functioning just fine and consuming nearly all memory. Sooner or later, some change in data, hardware, os, or script would push it over the edge.

[Timothy Delaney]

That's what I was getting at. I know that (for example) most classes I create have less that 16 entries in their dict. With this change, each class instance would take (approx) twice as much memory for its dict. I suspect that class instance dict is the most common dictionary I use.

Those dicts would also be the ones benefitting from the patch. Their density would be halved; resulting in fewer collisions, improved search times, and better cache performance.

[Timothy Delaney]

Perhaps we need to add some internal profiling, so that "quickly-growing" dictionaries get larger reallocations ;)

I came up with this patch a couple of months ago and have since tried every tweak I could think of (apply to this size dict but not that one, etc) but found nothing that survived a battery of application benchmarks.

Have you guys tried out the patch? I'm very interested in getting results from different benchmarks, processors, cache sizes, and various operating systems.

sparse-is-better-than-dense-ly yours,

Raymond (currently, the only one. unlike two Tims, two Bretts, two Jacks and a Fredrik distinct from Fred)

################################################################# ################################################################# ################################################################# ##### ##### ##### ################################################################# ################################################################# #################################################################