DataFrame.to_json silently ignores index parameter for most orients. · Issue #25513 · pandas-dev/pandas

see #25513 (comment)

Code Sample, a copy-pastable example if possible

from __future__ import print_function

import pandas as pd

df = pd.DataFrame.from_records(
    [
        {'id': 'abc', 'data': 'qwerty'},
        {'id': 'def', 'data': 'uiop'},
    ],
    index='id',
)

print(df.to_json(orient='records', index=True))

Prints out:

[{"data":"qwerty"},{"data":"uiop"}]

series = df.squeeze()

print(series.to_json(orient='records', index=True))

Prints out:

["qwerty","uiop"]

Problem description

Given a DataFrame built from two columns, one used as the index and the other holding the data, calling .to_json(orient='records', index=True) omits the index from the output even though index=True was passed, and no warning or error is raised. I know that in theory I should be using a Series for this, but I'm using it to convert a CSV file into JSONL and I don't know what the CSV file is going to look like ahead of time.

In any case, squeezing the DataFrame into a Series doesn't work either. In fact, the bug in Series.to_json is even worse, as it produces an array of strings instead of an array of dictionaries.
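For the Series case, one possible workaround (a sketch, not a fix for the underlying behavior) is to call reset_index() before serializing. This assumes the Series has a name and a named index, as it does here after df.squeeze(); the variable name series_records is only illustrative.

```python
import json

import pandas as pd

df = pd.DataFrame.from_records(
    [{'id': 'abc', 'data': 'qwerty'}, {'id': 'def', 'data': 'uiop'}],
    index='id',
)

# Squeezing the single-column DataFrame gives a Series named 'data',
# indexed by 'id'.
series = df.squeeze()

# reset_index() turns the named Series back into a two-column DataFrame,
# so orient='records' emits one dictionary per row with both fields.
series_records = json.loads(series.reset_index().to_json(orient='records'))
print(series_records)
```

The output is parsed with json.loads here only so the result can be inspected as Python objects.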

This bug is present in master.

Expected Output

Expected output is:

[{"id":"abc","data":"qwerty"},{"id":"def","data":"uiop"}]
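In the meantime, a minimal sketch of how this output might be obtained without relying on the index parameter: move the index into a regular column with reset_index() and then serialize. The lines=True variant is included because the use case here is CSV to JSONL. The names records_json, jsonl and rows are illustrative only.

```python
import json

import pandas as pd

df = pd.DataFrame.from_records(
    [{'id': 'abc', 'data': 'qwerty'}, {'id': 'def', 'data': 'uiop'}],
    index='id',
)

# Move the 'id' index into an ordinary column so orient='records' sees it.
flat = df.reset_index()

# A single JSON array of row dictionaries.
records_json = flat.to_json(orient='records')

# JSON Lines: one JSON object per line (lines=True requires orient='records').
jsonl = flat.to_json(orient='records', lines=True)

# Parse each non-empty line back to check that every row carries both fields.
rows = [json.loads(line) for line in jsonl.splitlines() if line]
print(records_json)
print(rows)
```

This relies on the index having a name; for an unnamed index, reset_index() would produce a column called 'index' instead.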

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.2.final.0
python-bits: 64
OS: Darwin
OS-release: 17.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.1
pytest: 4.3.0
pip: 19.0.3
setuptools: 40.8.0
Cython: 0.29.6
numpy: 1.16.2
scipy: None
pyarrow: 0.12.1
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: 2.7.7 (dt dec pq3 ext lo64)
jinja2: None
s3fs: 0.2.0
fastparquet: 0.2.1
pandas_gbq: None
pandas_datareader: None
gcsfs: None