BUG: Time zone information lost for some dateutil time zones · Issue #9663 · pandas-dev/pandas
The dateutil package allows you to create time zone (tzfile) objects in two ways: either by using dateutil.tz.gettz to read time zone data from the file system (/usr/share/zoneinfo), or by using dateutil.zoneinfo.gettz to read time zone data from a tar file distributed in the dateutil package.
The tslib.maybe_get_tz function doesn't handle the dateutil.tz.gettz variant.
from datetime import datetime
import pandas as pd
import pandas.tslib as tslib
import dateutil.tz
import dateutil.zoneinfo
tz1 = dateutil.tz.gettz('America/New_York')
tz2 = dateutil.zoneinfo.gettz('America/New_York')
d1 = datetime(2015, 1, 1, tzinfo=tz1)
d2 = datetime(2015, 1, 1, tzinfo=tz2)
maybe_get_tz returns None for tz1, but works correctly for tz2:
>>> tslib.maybe_get_tz('dateutil/' + tz1._filename)
>>> tslib.maybe_get_tz('dateutil/' + tz2._filename)
tzfile('America/New_York')
As a result, a DatetimeIndex built from such datetimes is missing its time zone information:
>>> pd.to_datetime([d1])
<class 'pandas.tseries.index.DatetimeIndex'>
[2015-01-01 05:00:00]
Length: 1, Freq: None, Timezone: None
>>> pd.to_datetime([d2])
<class 'pandas.tseries.index.DatetimeIndex'>
[2015-01-01 00:00:00-05:00]
Length: 1, Freq: None, Timezone: tzfile('America/New_York')
I think if maybe_get_tz were to first try dateutil.zoneinfo.gettz, and then fall back on dateutil.tz.gettz, the problem would be solved.
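The fallback order suggested here could look roughly like the sketch below. It is not the pandas implementation: resolve_dateutil_tz and dateutil_lookups are hypothetical names, and the only behavior taken from this report is that the string starts with 'dateutil/' and that a failed lookup returns None. The lookup functions are passed in, so the ordering logic can be exercised without dateutil installed.

```python
def resolve_dateutil_tz(name, lookups):
    """Try each lookup in order and return the first non-None time zone.

    `name` may carry the 'dateutil/' prefix seen in this report.
    `lookups` is an ordered list of callables taking a zone name (hypothetical API).
    """
    prefix = "dateutil/"
    if name.startswith(prefix):
        name = name[len(prefix):]
    for lookup in lookups:
        tz = lookup(name)
        if tz is not None:
            return tz
    # Every lookup failed; mirror the None result described in the report.
    return None


def dateutil_lookups():
    """Return the two dateutil lookups in the order proposed above.

    Imported lazily so the helper above stays usable without dateutil.
    Note: dateutil.zoneinfo.gettz is deprecated in recent dateutil releases.
    """
    import dateutil.tz
    import dateutil.zoneinfo
    return [dateutil.zoneinfo.gettz, dateutil.tz.gettz]
```

With dateutil installed, resolve_dateutil_tz('dateutil/' + tz1._filename, dateutil_lookups()) would try the bundled tar file first and the file system second, which is the order proposed above.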
This was a regression between 0.14 and 0.15.