[Python-Dev] More compact dictionaries with faster iteration

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Wed Dec 12 21:37:08 CET 2012


On 12/12/2012 01:15 AM, Nick Coghlan wrote:

> On Wed, Dec 12, 2012 at 5:37 AM, Dino Viehland <dinov at microsoft.com> wrote:

>> OTOH changing certain dictionaries in IronPython (such as keyword args) to be ordered would certainly be possible. Personally I just wouldn't want to see it be the default as that seems like unnecessary overhead when the specialized class exists.

> Which reminds me, I was going to note that one of the main gains with ordered keyword arguments is their use in the construction of string-keyed objects where you want to be able to control the order of iteration (e.g. for serialisation or display purposes). Currently you have to go the path of something like namedtuple, where you define the order of iteration in one operation and set the values in another.

So here's a brand new argument against ordered dicts: The existence of perfect hashing schemes. They fundamentally conflict with ordered dicts.

I played with using them for vtable dispatches in Cython this summer, and they can perform really, really well for branch-predicted lookups in hot loops: you always (or nearly always) eliminate linear probing, so there are no branch misses or extra comparisons. (The overhead of a perfect hash table lookup over a traditional vtable lookup was only a couple of cycles in my highly artificial, fully branch-predicted micro-benchmark.)

There's some overhead in setup; IIRC, ~20 microseconds for 64 elements on a 2 GHz CPU, though that was a first prototype implementation and both algorithmic improvements and tuning should be possible.

So it's not useful for everything, but perhaps for things like module dictionaries and classes an "optionally perfect" dict can make sense.

Note: I'm NOT suggesting the use of perfect hashing, just making sure its existence is mentioned, and that one is aware that if always-ordered dicts become the language standard it precludes this option far off in the future.

(Something like a sort() method could still work and make the dict "unperfect"; one could also have a pack() method that made the dict perfect again.)

That concludes the on-topic parts of my post.

-- Dag Sverre Seljebotn

APPENDIX

Going off-topic for those who are interested, here are the long-winded and ugly details. My code [1] is based on the paper [2] (pseudo-code in Appendix A), but I adapted it a bit to be useful for tens/hundreds of elements rather than billions.

The ingredients:

  1. You need the hash to be 32 (or 64) bits of good entropy (md5 or murmurhash or similar). (Yes, that's a tall order for CPython; I'm just describing the scheme.) (If two hashes collide on all bits the table lookup will collide too, so some fallback is still necessary, it's just unlikely to be needed.)

  2. To look up an element, the idea is (pseudo-code!):

typedef struct { int m_f, m_g, r, k; int16_t d[]; /* flexible array of k "small" ints, like in the current proposal */ } table_header_t;

And then one computes the index of an element with hash "h" using the function

((h >> tab->r) & tab->m_f) ^ tab->d[h & tab->m_g]

rather than the usual "h % n". While this is more arithmetic, arithmetic is cheap and branch misses are not.
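
To make that concrete, here's a minimal, self-contained sketch of the lookup path (my reconstruction, not the code from [1]; the concrete field types and the name ph_index are my guesses, and the rare full-hash-collision fallback is omitted):

#include <stddef.h>
#include <stdint.h>

/* Restates the pseudo-code header above with concrete types;
   displacements are assumed non-negative. */
typedef struct {
    uint64_t m_f;  /* mask picking the slot bits out of the hash */
    uint64_t m_g;  /* mask picking the bin index out of the hash */
    unsigned r;    /* shift parameter found during setup */
    unsigned k;    /* number of displacement entries d[0..k-1] */
    int16_t  d[];  /* displacement table ("small" ints) */
} table_header_t;

static inline size_t ph_index(const table_header_t *tab, uint64_t h)
{
    /* One shift, two masks, one xor: no probing loop and no
       data-dependent branches to mispredict. */
    return ((h >> tab->r) & tab->m_f) ^ (uint64_t)tab->d[h & tab->m_g];
}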

  3. To set up/repack a table one needs to find the parameters. The general idea is:

a) Partition the hashes into k bins by using "h & m_g". There will be collisions, but the number of bins with many collisions will be very small; most bins will have 2 or 1 or 0 elements.

b) Starting with the largest bin, distribute each bin's elements into the table using the index function above. If a bin collides with the existing contents, try another value for d[binindex] until it doesn't.

The r parameter lets you try again 32 (or 64) times to find a solution. In my test cases there was a ~0.1% chance of not finding a solution (that is, of exhausting the possible choices of r) with 64-bit hashes, 4 or 8 elements, and no empty table elements. For any other number of elements, or with some empty elements, the chance of failure was much lower.
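
For the curious, here's a rough sketch of what that parameter search could look like (again my reconstruction, not the code from [1]; the name ph_build and the power-of-two assumption on n and k are mine, and bins are taken in index order rather than largest-first to keep the sketch short, which only lowers the success probability):

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Searches for an r and a displacement table d[] (length k) that place
   nhashes hashes into n slots without collisions; n and k are assumed
   to be powers of two, so m_f = n - 1 and m_g = k - 1. */
static bool ph_build(const uint64_t *hashes, size_t nhashes,
                     size_t n, size_t k, unsigned *r_out, int16_t *d)
{
    const uint64_t m_f = n - 1, m_g = k - 1;
    bool *taken = calloc(n, sizeof *taken);      /* slot occupancy */
    size_t *bin = malloc(nhashes * sizeof *bin); /* indices in current bin */
    if (!taken || !bin) { free(taken); free(bin); return false; }

    for (unsigned r = 0; r < 64; r++) {          /* up to 64 retries via r */
        memset(taken, 0, n * sizeof *taken);
        bool ok = true;
        for (uint64_t b = 0; ok && b < k; b++) {
            /* step a): collect the elements of bin b (those with h & m_g == b) */
            size_t nbin = 0;
            for (size_t i = 0; i < nhashes; i++)
                if ((hashes[i] & m_g) == b)
                    bin[nbin++] = i;
            if (nbin == 0) { d[b] = 0; continue; }

            /* step b): try displacements until the whole bin fits */
            int found = -1;
            for (size_t disp = 0; found < 0 && disp < n; disp++) {
                bool fits = true;
                for (size_t j = 0; fits && j < nbin; j++) {
                    uint64_t slot = ((hashes[bin[j]] >> r) & m_f) ^ disp;
                    if (taken[slot])
                        fits = false;            /* collides with earlier bins */
                    for (size_t j2 = 0; fits && j2 < j; j2++)
                        if (slot == (((hashes[bin[j2]] >> r) & m_f) ^ disp))
                            fits = false;        /* intra-bin collision */
                }
                if (fits) found = (int)disp;
            }
            if (found < 0) { ok = false; break; } /* give up, try the next r */

            d[b] = (int16_t)found;
            for (size_t j = 0; j < nbin; j++)
                taken[((hashes[bin[j]] >> r) & m_f) ^ (uint64_t)found] = true;
        }
        if (ok) { *r_out = r; free(taken); free(bin); return true; }
    }
    free(taken); free(bin);
    return false;   /* the rare "no solution" case described above */
}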

[1] It's not exactly a great demo, but it contains the algorithm. If there's much interest I should clean it up and make a proper benchmark demo out of it:

https://github.com/dagss/pyextensibletype/blob/perfecthash/include/perfecthash.h

[2] R. Pagh, "Hash and Displace: Efficient Evaluation of Minimal Perfect Hash Functions", BRICS RS-99-13 (1999):

http://www.brics.dk/RS/99/13/BRICS-RS-99-13.ps.gz


