(original) (raw)

Cool, thanks! I'm curious why this was brought up at all then...

On Dec 15, 2017 3:36 PM, "Raymond Hettinger" <raymond.hettinger@gmail.com> wrote:


\> On Dec 15, 2017, at 1:47 PM, Guido van Rossum <guido@python.org> wrote:
\>
\> On Fri, Dec 15, 2017 at 12:45 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
\>
\> > On Dec 15, 2017, at 7:53 AM, Guido van Rossum <guido@python.org> wrote:
\> >
\> > Make it so. "Dict keeps insertion order" is the ruling.
\>
\> On Twitter, someone raised an interesting question.
\>
\> Is the guarantee just for 3.7 and later? Or will the blessing also cover 3.6 where it is already true.
\>
\> The 3.6 guidance is to use OrderedDict() when ordering is required. As of now, that guidance seems superfluous and may no longer be a sensible practice. For example, it would be nice for Eric Smith when he does his 3.6 dataclasses backport to not have to put OrderedDict back in the code.
\>
\> For 3.6 we can't change the language specs, we can just document how it works in CPython. I don't know what other Python implementations do in their version that's supposed to be compatible with 3.6 but I don't want to retroactively declare them non-conforming. (However for 3.7 they have to follow suit.) I also don't think that the "it stays ordered across deletions" part of the ruling is true in CPython 3.6.

FWIW, the regular dict does stay ordered across deletions in CPython3.6:

\>>> d = dict(a=1, b=2, c=3, d=4)
\>>> del d\['b'\]
\>>> d\['b'\] = 5
\>>> d
{'a': 1, 'c': 3, 'd': 4, 'b': 5}

Here's are more interesting demonstration:

from random import randrange, shuffle
from collections import OrderedDict

population = 1000000
s = list(range(population // 4))
shuffle(s)
d = dict.fromkeys(s)
od = OrderedDict.fromkeys(s)
for i in range(500000):
k = randrange(population)
d\[k\] = i
od\[k\] = i
k = randrange(population)
if k in d:
del d\[k\]
del od\[k\]
assert list(d.items()) == list(od.items())

The dict object insertion logic just appends to the arrays of keys, values, and hashvalues. When the number of usable elements decreases to zero (reaching the limit of the most recent array allocation), the dict is resized (compacted) left-to-right so that order is preserved.

Here are some of the relevant sections from the 3.6 source tree:

Objects/dictobject.c line 89:

Preserving insertion order

It's simple for combined table. Since dk\_entries is mostly append only, we can
get insertion order by just iterating dk\_entries.

One exception is .popitem(). It removes last item in dk\_entries and decrement
dk\_nentries to achieve amortized O(1). Since there are DKIX\_DUMMY remains in
dk\_indices, we can't increment dk\_usable even though dk\_nentries is
decremented.

In split table, inserting into pending entry is allowed only for dk\_entries\[ix\]
where ix == mp->ma\_used. Inserting into other index and deleting item cause
converting the dict to the combined table.

Objects/dictobject.c::insertdict() line 1140:

if (mp->ma\_keys->dk\_usable <= 0) {
/\* Need to resize. \*/
if (insertion\_resize(mp) < 0) {
Py\_DECREF(value);
return -1;
}
hashpos = find\_empty\_slot(mp->ma\_keys, key, hash);
}

Objects/dictobject.c::dictresize() line 1282:

PyDictKeyEntry \*ep = oldentries;
for (Py\_ssize\_t i = 0; i < numentries; i++) {
while (ep->me\_value == NULL)
ep++;
newentries\[i\] = \*ep++;
}

\>
\> I don't know what guidance to give Eric, because I don't know what other implementations do nor whether Eric cares about being compatible with those. IIUC micropython does not guarantee this currently, but I don't know if they claim Python 3.6 compatibility -- in fact I can't find any document that specifies the Python version they're compatible with more precisely than "Python 3".


I did a little research and here' what I found:

"MicroPython aims to implement the Python 3.4 standard (with selected features from later versions)"
\-- http://docs.micropython.org/en/latest/pyboard/reference/index.html

"PyPy is a fast, compliant alternative implementation of the Python language (2.7.13 and 3.5.3)."
\-- http://pypy.org/

"Jython 2.7.0 Final Released (May 2015)"
\-- http://www.jython.org/

"IronPython 2.7.7 released on 2016-12-07"
\-- http://ironpython.net/

So, it looks like your could say 3.6 does whatever CPython 3.6 already does and not worry about leaving other implementations behind. (And PyPy is actually ahead of us here, having compact and order-preserving dicts for quite a while).

Cheers,


Raymond