Better error message when handling bad keys in json data · Issue #4730 · pandas-dev/pandas

Should read_json check for bad keys at the top level and catch them instead of passing them directly to the DataFrame constructor?

Currently there is a test case in io/tests/test_json/test_pandas.py that masks a slightly confusing error message:

    # bad key
    json = StringIO('{"badkey":["A","B"],'
                    '"index":["2","3"],'
                    '"data":[[1.0,"1"],[2.0,"2"],[null,"3"]]}')
    self.assertRaises(TypeError, read_json, json,
                      orient="split")

If you change the TypeError to a ValueError so the test fails, you get the following traceback:

ERROR: pandas.io.tests.test_json.test_pandas:TestPandasContainer.test_frame_from_json_bad_data
  /usr/local/bin/vim +256 pandas/io/tests/test_json/test_pandas.py  # test_frame_from_json_bad_data
    orient="split")
  /usr/local/bin/vim +471 python2.7/unittest/case.py  # assertRaises
    callableObj(*args, **kwargs)
  /usr/local/bin/vim +178 pandas/io/json.py  # read_json
    date_unit).parse()
  /usr/local/bin/vim +238 pandas/io/json.py  # parse
    self._parse_no_numpy()
  /usr/local/bin/vim +456 pandas/io/json.py  # _parse_no_numpy
    self.obj = DataFrame(dtype=None, **decoded)
TypeError: __init__() got an unexpected keyword argument 'badkey'
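
One way the proposed check could look is sketched below. This is only an illustration under assumptions, not the pandas implementation: the helper name `check_split_keys`, the constant `SPLIT_KEYS`, and the wording of the error message are all hypothetical, and the allowed keys are assumed to be the three that orient="split" passes on to the DataFrame constructor. The idea is to validate the decoded top-level keys first, so the caller sees a ValueError that names the bad key rather than a TypeError from `__init__()`.

```python
import json

# Top-level keys assumed valid for orient="split" (an assumption for this
# sketch: they mirror the keyword arguments given to the DataFrame constructor).
SPLIT_KEYS = frozenset({"columns", "index", "data"})


def check_split_keys(decoded):
    """Raise ValueError if `decoded` has top-level keys outside SPLIT_KEYS.

    Hypothetical helper for illustration; not part of pandas.
    """
    bad_keys = sorted(set(decoded) - SPLIT_KEYS)
    if bad_keys:
        raise ValueError(
            "JSON data had unexpected key(s): " + ", ".join(bad_keys)
        )
    return decoded


# Same payload as the test case above, decoded with the standard library.
raw = ('{"badkey":["A","B"],'
       '"index":["2","3"],'
       '"data":[[1.0,"1"],[2.0,"2"],[null,"3"]]}')

try:
    check_split_keys(json.loads(raw))
except ValueError as exc:
    message = str(exc)  # names the offending key instead of __init__()
```

If a check like this ran just before the `DataFrame(dtype=None, **decoded)` call in `_parse_no_numpy`, the test could assert a ValueError instead of a TypeError.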