Issue 34972: json dump silently converts int keys to string

Created on 2018-10-13 10:16 by My-Tien Nguyen, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (9)

msg327649 - (view)

Author: My-Tien Nguyen (My-Tien Nguyen)

Date: 2018-10-13 10:24

When int keys are silently converted to strings on JSON serialization, the user needs to remember to convert them back to int on loading. I think a warning should be shown at the very least.

In my case I serialize a dict to json with int keys, later load it back into a dict (resulting in a dict with string keys) and test for existence of an int key in the dict which will then return False incorrectly.

I am aware that json does not support int keys, but this can be easily forgotten.
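For illustration, a minimal sketch of the round trip described above. The dictionary contents are made up for the example.

```python
import json

original = {1: "one", 2: "two"}

# dumps() coerces the int keys to strings, because JSON object names are strings.
encoded = json.dumps(original)

# loads() cannot know the keys used to be ints, so they come back as str.
restored = json.loads(encoded)

print(encoded)               # {"1": "one", "2": "two"}
print(1 in restored)         # False
print("1" in restored)       # True
print(restored == original)  # False
```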

msg327650 - (view)

Author: Karthikeyan Singaravelan (xtreak) * (Python committer)

Date: 2018-10-13 11:48

Thanks for the report. There was a related issue a few days back. I think this is documented behavior; see https://docs.python.org/3.8/library/json.html#json.dumps . Adding a warning might break existing code, and I don't know of a safe way to introduce it as a code-level warning, given that this behavior is documented in both Python 2 and 3. I think other languages behave the same way: JavaScript itself converts int keys to strings without a warning, in line with the JSON standard. Correct me if I am wrong or if other languages do warn about this.

Note: Keys in key/value pairs of JSON are always of the type str. When a dictionary is converted into JSON, all the keys of the dictionary are coerced to strings. As a result of this, if a dictionary is converted into JSON and then back into a dictionary, the dictionary may not equal the original one. That is, loads(dumps(x)) != x if x has non-string keys.

You can try json.loads(data, parse_int=int), but it only converts values, not keys:

>>> json.loads(json.dumps({1:'1'}), parse_int=int)
{'1': '1'}
>>> json.loads(json.dumps({1:1}), parse_int=int)
{'1': 1}

Thanks

msg327654 - (view)

Author: My-Tien Nguyen (My-Tien Nguyen)

Date: 2018-10-13 14:56

I don’t think “other languages do that too” is a good argument here. It would apply if behaving differently broke user expectations. But here we would do nothing more than explicitly inform the user of a relevant operation. If they already expected that behaviour, they can disregard the warning.

I don’t see how parse_int would help me here. I would need a parse_str=int, but then it would try to parse every string, and I don’t see the use case for that.
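There is no parse_str parameter, but json.loads does accept an object_hook that is called with every decoded object. As a sketch of one possible workaround (not something proposed in this thread), the hypothetical helper int_keys_hook below converts keys that look like integers back to int and leaves values untouched. It assumes that no key was originally a digit-only string, since such keys would be converted too.

```python
import json
import re

# Matches strings that consist only of an optional minus sign and ASCII digits.
_INT_KEY = re.compile(r"-?[0-9]+")

def int_keys_hook(obj):
    """Hypothetical helper: turn integer-looking keys of a decoded object into int."""
    converted = {}
    for key, value in obj.items():
        if _INT_KEY.fullmatch(key):
            converted[int(key)] = value
        else:
            converted[key] = value
    return converted

data = json.dumps({1: "a", 2: {3: "b"}, "name": "c"})

# object_hook runs for every JSON object, including nested ones.
restored = json.loads(data, object_hook=int_keys_hook)

print(restored)       # {1: 'a', 2: {3: 'b'}, 'name': 'c'}
print(1 in restored)  # True
```

The hook cannot tell an original int key 1 from an original str key "1", so it only suits data where that distinction is known in advance.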

I would suggest a warning similar to this:

--- json/encoder.py
+++ json/encoder.py
@@ -1,6 +1,7 @@
 """Implementation of JSONEncoder
 """
 import re
+import warnings

 try:
     from _json import encode_basestring_ascii as c_encode_basestring_ascii
@@ -353,7 +354,9 @@
                 items = sorted(dct.items(), key=lambda kv: kv[0])
             else:
                 items = dct.items()

@@ -403,6 +406,8 @@
                 else:
                     chunks = _iterencode(value, _current_indent_level)
                 yield from chunks
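The added lines of the patch did not survive on this page, so the exact warning text is unknown. As a rough user-level sketch of the same idea, assuming a wrapper around json.dumps instead of a change to json/encoder.py, the hypothetical helpers warn_on_non_str_keys and dumps_with_key_warning below emit a RuntimeWarning for each non-string key before serializing. The warning category and message are assumptions.

```python
import json
import warnings

def warn_on_non_str_keys(obj):
    """Hypothetical helper: walk obj and warn about every non-str dict key."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if not isinstance(key, str):
                warnings.warn(
                    f"key {key!r} of type {type(key).__name__} "
                    "will be serialized as a string",
                    RuntimeWarning,
                )
            warn_on_non_str_keys(value)
    elif isinstance(obj, (list, tuple)):
        for item in obj:
            warn_on_non_str_keys(item)

def dumps_with_key_warning(obj, **kwargs):
    """Hypothetical wrapper: warn about non-str keys, then call json.dumps."""
    warn_on_non_str_keys(obj)
    return json.dumps(obj, **kwargs)

# Record the warnings so the example can show how many were raised.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = dumps_with_key_warning({1: "a", "b": {2: "c"}})

print(result)       # {"1": "a", "b": {"2": "c"}}
print(len(caught))  # 2
```

Because it goes through the warnings module, callers who expect the conversion can silence it with a warnings filter.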

msg327684 - (view)

Author: Eric V. Smith (eric.smith) * (Python committer)

Date: 2018-10-14 00:25

I can't think of another place where we issue a warning for anything similar. I'm opposed to any changes here: it's clearly documented behavior.

It's like being surprised .ini files convert to strings: it's just how that format works.

msg327688 - (view)

Author: Steve Dower (steve.dower) * (Python committer)

Date: 2018-10-14 03:12

Agreed with Eric. json.dump needs to produce valid JSON, which requires keys to be strings.

Try using pickle if you need to preserve full Python semantics.
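A short sketch of the pickle suggestion, with made-up example data: key types survive the round trip.

```python
import pickle

original = {1: "one", 2.5: "two and a half", (3, 4): "tuple key"}

# pickle stores the Python objects themselves, so key types are preserved.
restored = pickle.loads(pickle.dumps(original))

print(restored == original)  # True
print(1 in restored)         # True
```

Note that pickle is Python-specific and should only be used to load data from trusted sources.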

msg327703 - (view)

Author: My-Tien Nguyen (My-Tien Nguyen)

Date: 2018-10-14 11:32

Sure, I can do that, but wanted to propose this regardless. I guess this is a disagreement on a language design level. As a proponent of strong typing I wouldn’t have allowed non-string keys in the first place, and if they are allowed I would warn about conversion. This is also more aligned with the “explicit is better than implicit” principle.

msg361302 - (view)

Author: Facundo Batista (facundobatista) * (Python committer)

Date: 2020-02-03 14:42

I understand (and agree with) the merits of automatically converting the int to str when dumping to a string.

However, this result really surprised me:

>>> json.dumps({1:2, "1":3})
'{"1": 2, "1": 3}'

Is it valid JSON?
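On the question above: RFC 8259 says the names within an object SHOULD be unique, so such a document parses, but receivers may treat the repeated name differently. The sketch below shows what Python's own decoder does with it, and uses a hypothetical object_pairs_hook named reject_duplicates to detect the repetition instead.

```python
import json

text = json.dumps({1: 2, "1": 3})
print(text)  # {"1": 2, "1": 3}

# The default decoder keeps only the last value for a repeated name.
print(json.loads(text))  # {'1': 3}

def reject_duplicates(pairs):
    """Hypothetical hook: raise ValueError if an object repeats a name."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen

try:
    json.loads(text, object_pairs_hook=reject_duplicates)
except ValueError as exc:
    print(exc)  # duplicate key: '1'
```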

msg365838 - (view)

Author: Stub (Stub2)

Date: 2020-04-06 06:03

Similarly, keys can be lost entirely:

>>> json.dumps({1:2, 1.0:3})
'{"1": 3}'

msg365840 - (view)

Author: Stuart Bishop (stub)

Date: 2020-04-06 06:14

(sorry, my example is normal Python behavior: {1:1, 1.0:2} == {1:2}, {1.0:1} == {1:1})

History

Date                 User            Action  Args
2022-04-11 14:59:07  admin           set     github: 79153
2020-04-06 06:14:52  stub            set     nosy: + stub; messages: +
2020-04-06 06:03:29  Stub2           set     nosy: + Stub2; messages: +
2020-02-03 14:42:30  facundobatista  set     nosy: + facundobatista; messages: +
2018-10-14 11:46:08  eric.smith      set     resolution: not a bug
2018-10-14 11:33:13  My-Tien Nguyen  set     status: open -> closed
2018-10-14 11:32:16  My-Tien Nguyen  set     status: closed -> open; resolution: not a bug -> (no value); messages: +
2018-10-14 03:12:57  steve.dower     set     status: open -> closed; nosy: + steve.dower; messages: +; resolution: not a bug; stage: resolved
2018-10-14 00:25:04  eric.smith      set     nosy: + eric.smith; messages: +
2018-10-13 14:56:17  My-Tien Nguyen  set     messages: +
2018-10-13 11:48:46  xtreak          set     nosy: + xtreak; messages: +
2018-10-13 10:24:23  My-Tien Nguyen  set     type: behavior; messages: +; components: + Library (Lib); versions: + Python 3.6
2018-10-13 10:16:46  My-Tien Nguyen  create