pd.concat reset the tz-aware index to UTC · Issue #18523 · pandas-dev/pandas

Hi,
I am reopening #18422 with a copy-pastable example.

Code Sample, a copy-pastable example if possible

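import pandas as pd

# idx1 gets its time zone from the string 'Europe/Paris';
# idx2 inherits its time zone from the tz-aware Timestamps in idx1.
idx1 = pd.date_range('2011-01-01', periods=3, freq='H', tz='Europe/Paris')
idx2 = pd.date_range(start=idx1[0], end=idx1[-1], freq='H')

df1 = pd.DataFrame({'a': [1, 2, 3]}, index=idx1)
df2 = pd.DataFrame({'b': [1, 2, 3]}, index=idx2)

# Column-wise concatenation of two frames that cover the same instants
res = pd.concat([df1, df2], axis=1)

print(df1.index.tzinfo)
print(df2.index.tzinfo)
print(res.index.tzinfo)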

Output

Europe/Paris
Europe/Paris
UTC

Problem description

pd.concat resets the tz-aware index to UTC, even though both input indexes are in 'Europe/Paris'.

This seems to come from the two different representations of the time zone 'Europe/Paris', as this IPython output shows:

In [23]: df1.index.tzinfo
Out[23]: <DstTzInfo 'Europe/Paris' LMT+0:09:00 STD>

In [24]: df2.index.tzinfo
Out[24]: <DstTzInfo 'Europe/Paris' CET+1:00:00 STD>

In [26]: res.index.tzinfo
Out[26]: <UTC>
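The two representations can be reproduced with pytz alone, without pandas. The snippet below is only a minimal sketch of how pytz can hand out two distinct tzinfo objects for the same zone. It assumes pytz is installed, and the variable names are illustrative. Whether this difference is what actually makes pd.concat fall back to UTC is my guess and is not confirmed.

```python
from datetime import datetime

import pytz

# tzinfo as returned by a plain lookup of the zone name
tz_default = pytz.timezone('Europe/Paris')

# tzinfo attached to a concrete datetime after localization
tz_localized = tz_default.localize(datetime(2011, 1, 1)).tzinfo

print(repr(tz_default))
print(repr(tz_localized))

# Both objects name the same zone ...
same_zone = tz_default.zone == tz_localized.zone
# ... but they are not the same Python object.
same_object = tz_default is tz_localized
print(same_zone, same_object)
```

Comparing on the zone name rather than on object identity treats the two as the same time zone.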

Expected Output

Europe/Paris
Europe/Paris
Europe/Paris
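In the meantime, one possible way to get the expected zone back is to convert the index of the result explicitly after concatenating. This is a minimal sketch and not a fix for the underlying problem. It uses pd.offsets.Hour() in place of freq='H' only so that the snippet does not depend on how a given pandas version spells the hourly alias.

```python
import pandas as pd

hourly = pd.offsets.Hour()  # same spacing as freq='H' in the example above

idx1 = pd.date_range('2011-01-01', periods=3, freq=hourly, tz='Europe/Paris')
idx2 = pd.date_range(start=idx1[0], end=idx1[-1], freq=hourly)

df1 = pd.DataFrame({'a': [1, 2, 3]}, index=idx1)
df2 = pd.DataFrame({'b': [1, 2, 3]}, index=idx2)

res = pd.concat([df1, df2], axis=1)

# Convert the (possibly UTC) index back to the original zone.
# The instants are unchanged; only the zone used for display differs.
res.index = res.index.tz_convert('Europe/Paris')

print(res.index.tz)
```

Because tz_convert only changes the zone attached to the index and not the underlying instants, the rows stay aligned exactly as pd.concat produced them.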

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.21.0
pytest: None
pip: 9.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.13.1
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None