pivot function on timezone aware objects does not preserve timezone info in resulting dataframe index · Issue #5878 · pandas-dev/pandas
This bug is present in pandas 0.12.0.
Take an example DataFrame like the one below:
col data time
0 1 0 2013-03-22 11:00:00-04:00
1 2 1 2013-03-22 15:00:00-04:00
2 2 2 2013-03-22 11:00:00-04:00
3 1 3 2013-03-22 15:00:00-04:00
After pivoting, the old behavior in 0.10.1 correctly preserved the timezone info in the index, producing a new DataFrame like this:
col 1 2
time
2013-03-22 11:00:00-04:00 0 2
2013-03-22 15:00:00-04:00 3 1
However, in 0.12.0 this behavior is lost: the resulting index carries no timezone information, and its values are shifted to UTC:
col 1 2
time
2013-03-22 15:00:00 0 2
2013-03-22 19:00:00 3 1
Below is the code to reproduce this issue:
import datetime

import pandas as pn
import pytz

print(pn.__version__)

# Two timezone-aware timestamps in US/Eastern
est = pytz.timezone('US/Eastern')
dt1 = est.localize(datetime.datetime(2013, 3, 22, 11, 0, 0))
dt2 = est.localize(datetime.datetime(2013, 3, 22, 15, 0, 0))

df = pn.DataFrame({'time': [dt1, dt2, dt1, dt2], 'col': [1, 2, 2, 1], 'data': range(4)})
# Pivot so that the timezone-aware 'time' column becomes the index
pivotDf = df.pivot(index='time', columns='col', values='data')

print(df)
print(pivotDf)
print(pivotDf.index)
The output from 0.10.1 is:
0.10.1
col data time
0 1 0 2013-03-22 11:00:00-04:00
1 2 1 2013-03-22 15:00:00-04:00
2 2 2 2013-03-22 11:00:00-04:00
3 1 3 2013-03-22 15:00:00-04:00
col 1 2
time
2013-03-22 11:00:00-04:00 0 2
2013-03-22 15:00:00-04:00 3 1
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-03-22 11:00:00, 2013-03-22 15:00:00]
Length: 2, Freq: None, Timezone: US/Eastern
The output from 0.12.0 is:
0.12.0
col data time
0 1 0 2013-03-22 11:00:00-04:00
1 2 1 2013-03-22 15:00:00-04:00
2 2 2 2013-03-22 11:00:00-04:00
3 1 3 2013-03-22 15:00:00-04:00
col 1 2
time
2013-03-22 15:00:00 0 2
2013-03-22 19:00:00 3 1
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-03-22 15:00:00, 2013-03-22 19:00:00]
Length: 2, Freq: None, Timezone: None
Notice the None in the "Timezone:" field of the DatetimeIndex summary.
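Until this is fixed, a possible workaround is to restore the timezone on the pivoted index by hand. The sketch below is not from the pandas code base. It assumes, based on the 0.12.0 output above, that the naive index values are the original timestamps converted to UTC. `restore_index_tz` is a hypothetical helper name, and the frame is built directly with a naive index to imitate the affected pivot result.

```python
import pandas as pd


def restore_index_tz(frame, tz):
    """Return a copy of `frame` whose DatetimeIndex is expressed in `tz`.

    Hypothetical helper: assumes a naive index holds UTC values,
    as in the 0.12.0 output shown in this report.
    """
    result = frame.copy()
    index = result.index
    if index.tz is None:
        # Naive index: declare the values to be UTC before converting.
        index = index.tz_localize("UTC")
    result.index = index.tz_convert(tz)
    return result


# Imitate the affected pivot result: a naive index holding UTC values.
naive_index = pd.DatetimeIndex(
    ["2013-03-22 15:00:00", "2013-03-22 19:00:00"], name="time"
)
pivot_df = pd.DataFrame({1: [0, 3], 2: [2, 1]}, index=naive_index)

fixed = restore_index_tz(pivot_df, "US/Eastern")
print(fixed.index)
```

If the index is already timezone-aware, the helper only converts it to the requested zone, so it should be harmless on versions that preserve the timezone.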