read_json: ValueError: Value is too big · Issue #26068 · pandas-dev/pandas
Reopening issue #14530. The close description is incorrect: the JSON specification explicitly leaves numeric limits up to implementations.
From https://tools.ietf.org/html/rfc7159#section-6:

> This specification allows implementations to set limits on the range and precision of numbers accepted
The standard `json` library in Python supports large numbers, meaning the language supports JSON with these values:
```python
Python 3.6.8 (default, Apr  7 2019, 21:09:51)
[GCC 5.3.1 20160406 (Red Hat 5.3.1-6)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>> j = """{"id": 9253674967913938907}"""
>>> json.loads(j)
{'id': 9253674967913938907}
```
Loading a JSON file with large integers (greater than the signed 64-bit maximum, 2**63 - 1) results in "Value is too big". I have tried changing the orient to "records" and also passing in `dtype={'id': numpy.dtype('uint64')}`; the error is the same.
```python
import pandas

data = pandas.read_json('''{"id": 10254939386542155531}''')
print(data.describe())
```
Expected Output

```
                          id
count                      1
unique                     1
top     10254939386542155531
freq                       1
```
Actual Output (even with dtype passed in)

```
File "./parse_dispatch_table.py", line 34, in <module>
    print(pandas.read_json('''{"id": 10254939386542155531}''', dtype=dtype_conversions).describe())
File "/users/XXX/.local/lib/python3.4/site-packages/pandas/io/json.py", line 234, in read_json
    date_unit).parse()
File "/users/XXX/.local/lib/python3.4/site-packages/pandas/io/json.py", line 302, in parse
    self._parse_no_numpy()
File "/users/XXX/.local/lib/python3.4/site-packages/pandas/io/json.py", line 519, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None)
ValueError: Value is too big
```
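For context (my addition, not part of the original report): both reported values fit comfortably in Python's arbitrary-precision int but exceed the signed 64-bit range, which is why a parser that decodes numbers into fixed-width integers rejects them:

```python
# int64 can hold at most 2**63 - 1; both values from this report exceed it,
# so a JSON parser that decodes into fixed-width signed integers fails.
INT64_MAX = 2**63 - 1
print(INT64_MAX)                         # 9223372036854775807
print(9253674967913938907 > INT64_MAX)   # True
print(10254939386542155531 > INT64_MAX)  # True
```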
No problem using read_csv:
```python
import pandas
import io

print(pandas.read_csv(io.StringIO('''id\n10254939386542155531''')).describe())
```
Output using read_csv

```
                          id
count                      1
unique                     1
top     10254939386542155531
freq                       1
```
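As a possible workaround (my sketch, not part of the original issue): parse with the standard library's `json`, which has no fixed-width integer limit, and hand the already-decoded objects to pandas:

```python
import json
import pandas

# The stdlib parser returns an arbitrary-precision Python int,
# so the value survives; pandas only sees the decoded object.
record = json.loads('{"id": 10254939386542155531}')
df = pandas.DataFrame([record])
print(int(df["id"].iloc[0]))  # value is preserved intact
```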
Output of `pd.show_versions()`