[Python-Dev] Guarantee ordered dict literals in v3.7? (original) (raw)

Steven D'Aprano steve at pearwood.info
Mon Nov 6 22:51:49 EST 2017


On Mon, Nov 06, 2017 at 12🔞17PM +0200, Paul Sokolovsky wrote:

> I don't think that situation should change the decision,

Indeed, it shouldn't. What may change it is the simple and obvious fact that there's no need to change anything, as proven by the 25-year history of the language.

I disagree -- the history of Python shows that having dicts be unordered is a PITA for many Python programmers. Python eventually gained an ordered dict because it provides useful functionality that developers demand.

Every new generation of Python programmers comes along and gets confused by why dicts mysteriously change their order from how they were entered, why doctests involving dicts break, why keyword arguments lose their order, why they have to import a module to get ordered dicts instead of having it be built-in, etc. Historically, we had things like ConfigParser reordering ini files when you write them.

Having dicts be unordered is not a positive virtue, it is a limitation. Up until now, it was the price we've paid for having fast, O(1) dicts. Now we have a dict implementation which is fast, O(1) and ordered. Why pretend that we don't? This is a long-requested feature, and the cost appears to be small: by specifying this, all we do is rule out some, but not all, hypothetical future optimizations.

Unordered dicts served CPython well for 20+ years, but I doubt many people will miss them.

What happens now borders on technologic surrealism - the CPython, after many years of persuasion, switched its dict algorithm, rather inefficient in terms of memory, to something else, less inefficient (still quite inefficient, taking "no overhead" as the baseline).

Trading off space for time is a very common practice. You said that lookups on MicroPython's dicts are O(N). How efficient is µPy when doing a lookup of a dict with ten million keys?

µPy has chosen to optimize for space, rather than time. That's great. But I don't think you should sneer at CPython's choice to optimize for time instead.

And given that µPy's dicts already fail to meet the expected O(1) dict behviour, and the already large number of functional differences (not just performance differences) between µPy and Python:

http://docs.micropython.org/en/latest/pyboard/genrst/index.html

I don't think that this will make much practical difference. MicroPython users already cannot expect to run arbitrary Python code that works in other implementations: the Python community is fragmented between µPy code written for tiny machines, and Python code for machines with lots of memory.

That algorithm randomly had another property. Now there's a seemingly serious talk of letting that property leak into the language spec,

It will no more be a "leak" than any other deliberate design choice.

despite the fact that there can be unlimited number of dictionary algorithms, most of them not having that property.

Sure. So what? There's an unlimited number of algorithms that don't provide the functionality that we want. There are an unlimited number of sort algorithms, but Python guarantees that we're only going to use those that are stable. Similar applies for method resolution (which µPy already violates), strings, etc.

What it will lead to is further fragmentation of the community.

Aren't you concerned about fragmenting the community because of the functional differences between MicroPython and the specs?

Sometimes a small amount of fragmentation is unavoidable, and not necessarily a bad thing.

> P.S. If anyone does want to explore MicroPython's dict implementation, > and see if there might be an alternate implementation strategy that > offers both O(1) lookup and guaranteed ordering without using > additional memory

That would be the first programmer in the history to have a cake and eat it too. Memory efficiency, runtime efficiency, sorted order: choose 2 of 3.

Given that you state that µPy dicts are O(N) and unordered, does that mean you picked only 1 out of 3?

-- Steve



More information about the Python-Dev mailing list