[Python-Dev] Dictionary tuning (original) (raw)

Raymond Hettinger python@rcn.com
Mon, 28 Apr 2003 19:50:14 -0400


[jack, master of the mac]

You wouldn't have some created some handy tables of 'typical' dictionary usage, would you? They would be interesting in general, but very nice for the PEPs doing dict optimizations for symbol tables in particular.

That path proved fruitless. I studied the usage patterns in about a dozen of my apps and found that there is no such thing as typical. Instead there are many categories of dictionary usage.

Almost every dictionary tune-up that helped one app would end-up hurting another. The only way to test the effectiveness of a change was to time a whole suite of applications. The standard benchmarks were useless in this regard. Worse still, contrived test programs would not predict the results for real apps. There were several reasons for this:

I had done some experiments that focused on symbol tables and had some luck with sequential searches into a self-organizing list. Using a list eliminated the holes and allowed more of the entries to fit in a single cache line. No placeholders were needed for deleted entries and that saves a test in the search loop. The self-organizing property kept the most frequently accessed entries at the head of the list. Using a sequential search had slightly less overhead than the hash table search pattern. Except for the builtin dictionary, most of the symbol tables in my apps have only a handful of entries.

if-only-i-had-had-a-single-valid-dict-performance-predictor-ly yours,

Raymond Hettinger