pd.concat reset the tz-aware index to UTC · Issue #18523 · pandas-dev/pandas
Hi,
I am reopening #18422 with a copy-pastable example.
Code Sample, a copy-pastable example if possible
import pandas as pd
idx1 = pd.date_range('2011-01-01', periods=3, freq='H', tz='Europe/Paris')
idx2 = pd.date_range(start=idx1[0], end=idx1[-1], freq='H')
df1 = pd.DataFrame({'a': [1, 2, 3]}, index=idx1)
df2 = pd.DataFrame({'b': [1, 2, 3]}, index=idx2)
res = pd.concat([df1, df2], axis=1)
print(df1.index.tzinfo)
print(df2.index.tzinfo)
print(res.index.tzinfo)
Output
Europe/Paris
Europe/Paris
UTC
Problem description
pd.concat resets the tz-aware index to UTC.
This seems to come from the two different representations of the 'Europe/Paris' time zone, as this IPython output shows:
In [23]: df1.index.tzinfo
Out[23]: <DstTzInfo 'Europe/Paris' LMT+0:09:00 STD>
In [24]: df2.index.tzinfo
Out[24]: <DstTzInfo 'Europe/Paris' CET+1:00:00 STD>
In [26]: res.index.tzinfo
Out[26]: <UTC>
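A quick check that the two indexes really refer to the same zone and the same instants, even though their tzinfo objects print differently. This is only a sketch: the variable names are illustrative, comparing zones by their string name is an assumption about what counts as "the same zone", and freq is passed as pd.offsets.Hour() purely to avoid alias differences between pandas versions.

```python
import pandas as pd

hour = pd.offsets.Hour()
idx1 = pd.date_range('2011-01-01', periods=3, freq=hour, tz='Europe/Paris')
# idx2 takes its time zone from the tz-aware start/end timestamps.
idx2 = pd.date_range(start=idx1[0], end=idx1[-1], freq=hour)

# Compare the zones by name rather than by tzinfo object.
name1 = str(idx1.tz)
name2 = str(idx2.tz)
same_zone = name1 == name2

# Compare the underlying instants after converting both indexes to UTC.
same_instants = list(idx1.tz_convert('UTC')) == list(idx2.tz_convert('UTC'))

print(name1, name2, same_zone, same_instants)
```

If both flags are true, the indexes differ only in which tzinfo object they carry, which matches the output above.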
Expected Output
Europe/Paris
Europe/Paris
Europe/Paris
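Until this is fixed, a possible workaround is to convert the concatenated index back to the intended zone. This is a minimal sketch, not a fix for pd.concat itself: it assumes the result is still tz-aware (UTC in the report above), so tz_convert only changes the zone the index is displayed in and not the instants. freq is given as pd.offsets.Hour() to stay independent of the frequency alias spelling.

```python
import pandas as pd

hour = pd.offsets.Hour()
idx1 = pd.date_range('2011-01-01', periods=3, freq=hour, tz='Europe/Paris')
idx2 = pd.date_range(start=idx1[0], end=idx1[-1], freq=hour)

df1 = pd.DataFrame({'a': [1, 2, 3]}, index=idx1)
df2 = pd.DataFrame({'b': [1, 2, 3]}, index=idx2)

res = pd.concat([df1, df2], axis=1)

# Re-express the index in the intended zone. If concat already kept
# 'Europe/Paris', this leaves the index unchanged.
res.index = res.index.tz_convert('Europe/Paris')

print(res.index.tz)
```

Because tz_convert keeps the same instants, the rows of res stay aligned exactly as pd.concat produced them.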
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.21.0
pytest: None
pip: 9.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.13.1
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None