BUG: ujson labels are encoded twice by Komnomnomnom · Pull Request #4593 · pandas-dev/pandas
With its current label handling, ujson ends up encoding labels twice, which causes problems when they contain characters that need escaping (quotes, backslashes, forward slashes):
In [16]: df = DataFrame([['a', 'b'], ['c', 'd']], index=['index " 1', 'index / 2'], columns=['a \ b', 'y / z'])

In [17]: df
Out[17]:
           a \ b  y / z
index " 1      a      b
index / 2      c      d

In [18]: json = df.to_json()

In [19]: json
Out[19]: '{"a \\\\ b":{"index \\\" 1":"a","index \\\/ 2":"c"},"y \\\/ z":{"index \\\" 1":"b","index \\\/ 2":"d"}}'

In [20]: pd.read_json(json)
Out[20]:
           a \ b  y / z
index " 1      a      b
index / 2      c      d
This PR fixes this behaviour so labels are only encoded a single time.
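The effect of encoding a label twice can be reproduced outside pandas. The following is a minimal sketch using the standard library `json` module rather than ujson; the helpers `encode_once` and `encode_twice` are hypothetical names introduced for illustration and are not part of pandas or ujson. Note that the standard library does not escape forward slashes, so only the quote and backslash labels from the example are shown.

```python
import json


def encode_once(label):
    # Correct behaviour: the label is escaped and quoted exactly one time.
    return json.dumps(label)


def encode_twice(label):
    # Simulates the bug (an assumption about its shape, not the ujson code):
    # the label's characters are escaped first, and the already-escaped
    # text is then encoded again as a JSON string.
    escaped = json.dumps(label)[1:-1]  # drop the surrounding quotes
    return json.dumps(escaped)


for label in ['index " 1', 'a \\ b']:
    once = encode_once(label)
    twice = encode_twice(label)
    print(once, "->", twice)
    # A standard decoder recovers the label only from the single encoding.
    assert json.loads(once) == label
    assert json.loads(twice) != label
```

The doubly encoded forms printed here, `"index \\\" 1"` and `"a \\\\ b"`, match the keys in the `to_json()` output above. A decoder that unescapes only once gets back a label that still contains escape characters, which is why other JSON consumers would see the wrong keys.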