
On Mon, Nov 6, 2017 at 4:11 AM Nick Coghlan <ncoghlan@gmail.com> wrote:
Here's a more-complicated-than-a-doctest-for-a-dict-repr, but still
fairly straightforward, example regarding the "insertion ordering
dictionaries are easier to use correctly" argument:

import json
data = {"a":1, "b":2, "c":3}
rendered = json.dumps(data)
data2 = json.loads(rendered)
rendered2 = json.dumps(data2)
# JSON round trip
assert data == data2, "JSON round trip failed"
# Dict round trip
assert rendered == rendered2, "dict round trip failed"

Both of those assertions will always pass in CPython 3.6, as well as
in PyPy, because their dict implementations are insertion ordered,
which means the iteration order on the dictionaries is always "a",
"b", "c".

If you try it on 3.5 though, you should fairly consistently see that
last assertion fail, since there's nothing in 3.5 that ensures that
data and data2 will iterate over their keys in the same order.
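
The ordering difference is easiest to see with keys that are not
already alphabetical. A minimal sketch, assuming an insertion-ordered
dict implementation (CPython 3.6+ or PyPy); on 3.5 the two key lists
below are not guaranteed to match the insertion order:

```python
import json

# Keys are inserted out of alphabetical order on purpose, so that
# insertion order cannot be mistaken for sorting.
data = {"c": 3, "a": 1, "b": 2}
rendered = json.dumps(data)
data2 = json.loads(rendered)

# With an insertion-ordered dict, both dicts iterate over their keys
# in the order the keys were first inserted.
keys_before = list(data)
keys_after = list(data2)
```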

You can make that code implementation independent (and sufficiently
version independent to pass both assertions) by using OrderedDict:

from collections import OrderedDict
import json
data = OrderedDict(a=1, b=2, c=3)
rendered = json.dumps(data)
data2 = json.loads(rendered, object_pairs_hook=OrderedDict)
rendered2 = json.dumps(data2)
# JSON round trip
assert data == data2, "JSON round trip failed"
# Dict round trip
assert rendered == rendered2, "dict round trip failed"

However, despite the way this code looks, the serialised key order
*might not* be "a, b, c" on 3.5 and earlier (it will be on 3.6+, since
that already requires that kwarg order be preserved).

So the formally correct version independent code that reliably ensures
that the key order in the JSON file is always "a, b, c" looks like
this:

from collections import OrderedDict
import json
data = OrderedDict((("a",1), ("b",2), ("c",3)))
rendered = json.dumps(data)
data2 = json.loads(rendered, object_pairs_hook=OrderedDict)
rendered2 = json.dumps(data2)
# JSON round trip
assert data == data2, "JSON round trip failed"
# Dict round trip
assert rendered == rendered2, "dict round trip failed"
# Key order
assert "".join(data) == "".join(data2) == "abc", "key order failed"

Getting from the "Works on CPython 3.6+ but is technically
non-portable" state to a fully portable correct implementation that
ensures a particular key order in the JSON file thus currently
requires the following changes:

Nick, it seems like this is more complicated than it needs to be. You can just pass sort_keys=True to json.dump() / json.dumps(). I use it for tests and human-readability all the time.

—Chris
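
For reference, a minimal sketch of what that looks like with the
example above, using a plain dict and the standard library's sort_keys
parameter. The keys are deliberately inserted out of order here to show
that the output no longer depends on the dict's iteration order:

```python
import json

# Plain dict, keys inserted in non-alphabetical order.
data = {"b": 2, "c": 3, "a": 1}

# sort_keys=True serialises keys in sorted order, regardless of how
# the dict happens to iterate.
rendered = json.dumps(data, sort_keys=True)
data2 = json.loads(rendered)
rendered2 = json.dumps(data2, sort_keys=True)

# JSON round trip
assert data == data2, "JSON round trip failed"
# Dict round trip
assert rendered == rendered2, "dict round trip failed"
```

Note that this pins the output to sorted key order, which coincides
with "a, b, c" in the example; it does not preserve an arbitrary
insertion order.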



- don't use a dict display, use collections.OrderedDict
- make sure to set object_pairs_hook when using json.loads
- don't use kwargs to OrderedDict, use a sequence of 2-tuples
For 3.6, we've already said that we want the last constraint to age
out, such that the middle version of the code also ensures a
particular key order.

The proposal is that in 3.7 we retroactively declare that the first,
most obvious, version of this code should in fact reliably pass all
three assertions.

Failing that, the proposal is that we instead change the dict
iteration implementation such that the dict round trip will start
failing reasonably consistently again (the same as it did in 3.5), so
that folks realise almost immediately that they still need
collections.OrderedDict instead of the builtin dict.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia