ENH: json_normalize should allow a different separator than . · Issue #14883 · pandas-dev/pandas
    >>> import pandas
    >>> col_in = ['c1', 'c2.x']
    >>> df = pandas.DataFrame([['A', 0], ['B', 1]], columns=col_in)
    >>> df.c1
    0    A
    1    B
    Name: c1, dtype: object
    >>> df.c2.x
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/generic.py", line 2744, in __getattr__
        return object.__getattribute__(self, name)
    AttributeError: 'DataFrame' object has no attribute 'c2'
Problem description
The above snippet shows that it's not ideal to have `.` as a character in a column name. (I'm running into this when using Vega for data visualization, vega/vega-lite#1775.) When `json_normalize` flattens a nested input JSON, it separates the nesting levels with a `.`. I believe this happens on this line:
    meta_keys = ['.'.join(val) for val in meta]
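To make the effect of that line concrete, here is a minimal sketch in plain Python. The `meta` value is a made-up example of nested key paths, not something taken from pandas itself; only the list comprehension mirrors the line quoted above.

```python
# Hypothetical meta paths: each inner list is the path to one nested field.
meta = [['info', 'governor'], ['state'], ['info', 'population', 'year']]

# Same expression as the quoted line: join each path with a literal period.
meta_keys = ['.'.join(val) for val in meta]

print(meta_keys)
# ['info.governor', 'state', 'info.population.year']
```

Every multi-level path ends up with a `.` in its name, which is what leads to column names like `c2.x`.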
I'd like to see an additional argument to `json_normalize`, `separator`, with default `.`, that specifies the string used to separate nesting levels. In the line of code above, `'.'.join(val)` would be replaced by `separator.join(val)` (if I'm reading what that line does correctly). I could then pass, say, `_` to use an underscore instead of a period.
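As a rough illustration of the requested behavior, the sketch below flattens a nested dict with a configurable separator. `flatten_record` is a hypothetical helper written for this example and is not pandas' implementation; the `separator` name simply follows the proposal above.

```python
def flatten_record(record, separator='.', prefix=''):
    """Flatten nested dicts, joining nesting levels with `separator`.

    Hypothetical helper for illustration only; not pandas code.
    """
    flat = {}
    for key, value in record.items():
        # Prepend the parent path (if any) using the chosen separator.
        name = prefix + separator + key if prefix else key
        if isinstance(value, dict):
            # Recurse into nested dicts, carrying the path built so far.
            flat.update(flatten_record(value, separator, name))
        else:
            flat[name] = value
    return flat


record = {'c1': 'A', 'c2': {'x': 0}}
print(flatten_record(record))                 # {'c1': 'A', 'c2.x': 0}
print(flatten_record(record, separator='_'))  # {'c1': 'A', 'c2_x': 0}
```

With `separator='_'` the flattened name `c2_x` is a valid Python identifier, so a column with that name would not run into the attribute-access problem shown at the top.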
n00b at pandas, please correct me if I'm doing anything wrong.
Expected Output
Output of pd.show_versions()
>>> pandas.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Darwin
OS-release: 16.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.US-ASCII
LOCALE: None.None
pandas: 0.19.1
nose: 1.3.7
pip: 9.0.1
setuptools: 30.3.0
Cython: 0.25.2
numpy: 1.11.2
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.2.3.1
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: 3.6.0
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None