[Python-Dev] More compact dictionaries with faster iteration (original) (raw)
PJ Eby pje at telecommunity.com
Tue Dec 11 00:05:04 CET 2012
- Previous message: [Python-Dev] More compact dictionaries with faster iteration
- Next message: [Python-Dev] More compact dictionaries with faster iteration
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Mon, Dec 10, 2012 at 4:29 PM, Tim Delaney <tim.delaney at aptare.com> wrote:
Whilst I think Python should not move to ordered dictionaries everywhere, I would say there is an argument (no pun intended) for making **kwargs a dictionary that maintains insertion order if there are no deletions.
Oooh. Me likey. There have been many times where I've wished kwargs were ordered when designing an API.
(Oddly, I don't remember any one of the APIs specifically, so don't ask me for a good example. I just remember a bunch of different physical locations where I was when I thought, "Ooh, what if I could... no, that's not going to work.")
One other useful place for ordered dictionaries is class definitions processed by class decorators: no need to write a metaclass just to know what order stuff was defined in.
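[Editor's sketch: the metaclass workaround being alluded to. Today, knowing definition order means writing a metaclass whose `__prepare__` returns an ordered mapping; `OrderedMeta`, `Point`, and `_field_order` are illustrative names, not anything in the stdlib. Ordered class namespaces would let a plain class decorator see this for free.]

```python
from collections import OrderedDict

class OrderedMeta(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwds):
        # Use an ordered mapping as the class-body namespace,
        # so the order of assignments is preserved.
        return OrderedDict()

    def __new__(mcs, name, bases, namespace, **kwds):
        cls = super().__new__(mcs, name, bases, dict(namespace))
        # Record the order in which attributes were defined,
        # skipping the dunder names Python injects.
        cls._field_order = [k for k in namespace
                            if not k.startswith('__')]
        return cls

class Point(metaclass=OrderedMeta):
    x = 0
    y = 0
    z = 0

# Point._field_order == ['x', 'y', 'z']
```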
It sounds like we could get that for free with this implementation, although from another post IronPython might not have something suitable.
Actually, IronPython may already have ordered dictionaries by default; see:
http://mail.python.org/pipermail/ironpython-users/2006-May/002319.html
It's described as an implementation detail that may change, perhaps that could be changed to being unchanging. ;-)
I think there are real advantages to doing so - a trivial one being the ability to easily initialise an ordered dictionary from another ordered dictionary.
Or to merge two of them together, either at creation or .update().
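[Editor's sketch: both operations already work with `collections.OrderedDict`, which shows the semantics an always-ordered dict would give. On `update()`, new keys are appended in the other dict's order, while existing keys keep their original position and just take the new value.]

```python
from collections import OrderedDict

a = OrderedDict([('x', 1), ('y', 2)])
b = OrderedDict([('y', 20), ('z', 3)])

# Initialising one ordered dict from another preserves order.
c = OrderedDict(a)

# Merging via update(): 'y' stays in place with the new value,
# 'z' is appended at the end.
c.update(b)

list(c.items())  # [('x', 1), ('y', 20), ('z', 3)]
```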
I'm really starting to wonder if it might not be worth paying the compaction overhead to just make all dictionaries ordered, all the time. The problem is that if you are always adding new keys and deleting old ones (as might be done in a LRU cache, a queue, or other things like that) you'll probably see a lot of compaction overhead compared to today's dicts.
OTOH... with a good algorithm for deciding when to compact, we can actually make the amortized cost O(1) or so, so maybe that's not a big deal. The cost of a compaction is at worst the current size of the table. So if you wait until a table has twice as many entries (more precisely, until the index of the last entry is twice what it was at last compaction), you amortize the compaction cost down to one entry move per add, or O(1).
That would handle the case of a cache or queue, but I'm not sure how it would work with supersized dictionaries that are then deleted down to a fraction of their original size. I suppose if you delete your way down to half the entries being populated, then you end up with two moves per delete, or O(2). (Yes, I know that's not a valid O number.)
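[Editor's sketch of the amortization policy described above, under stated assumptions: `CompactDict` is an illustrative toy, not the proposed CPython implementation. Entries live in a dense, insertion-ordered array; deletes leave tombstones; the array is compacted when it has doubled since the last compaction (the add case) or when more than half its slots are dead (the shrinking-dict case).]

```python
class CompactDict:
    """Toy model of the dense-entry dict layout being discussed."""

    _TOMBSTONE = object()

    def __init__(self):
        self._index = {}    # key -> position in self._entries
        self._entries = []  # dense [(key, value), ...] plus tombstones
        self._last_compact_len = 1

    def __setitem__(self, key, value):
        pos = self._index.get(key)
        if pos is not None:
            # Existing key keeps its position; only the value changes.
            self._entries[pos] = (key, value)
            return
        self._index[key] = len(self._entries)
        self._entries.append((key, value))
        # Amortized O(1): compact only once the entry array has
        # doubled since the last compaction.
        if len(self._entries) >= 2 * self._last_compact_len:
            self._compact()

    def __delitem__(self, key):
        pos = self._index.pop(key)
        self._entries[pos] = self._TOMBSTONE
        # Shrink case: compact when over half the slots are tombstones.
        if len(self._index) * 2 < len(self._entries):
            self._compact()

    def __getitem__(self, key):
        k, v = self._entries[self._index[key]]
        return v

    def keys(self):
        # Iteration is a dense scan, in insertion order.
        return [e[0] for e in self._entries if e is not self._TOMBSTONE]

    def _compact(self):
        # Squeeze out tombstones and rebuild the index; cost is
        # bounded by the current size of the entry array.
        self._entries = [e for e in self._entries
                         if e is not self._TOMBSTONE]
        self._index = {k: i for i, (k, v) in enumerate(self._entries)}
        self._last_compact_len = max(len(self._entries), 1)
```

Note that a churning workload (constant adds and deletes, as in an LRU cache or queue) repeatedly trips both compaction triggers, which is exactly the overhead worried about above; the doubling rule is what keeps it amortized rather than quadratic.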
So, offhand, it seems pretty doable, and unlikely to significantly change worst-case performance even for pathological use cases. (Like using a dict when you should be using a deque.)