tz_localize creates freq inconsistency on time offset change · Issue #30511 · pandas-dev/pandas

Code Sample, a copy-pastable example if possible

import pandas as pd

index = pd.date_range("2019-3-31", freq="30T", periods=10, tz="Europe/London")
print(index)  # this has a freq
print(index.freq)
print(index.tz_localize(None))  # this shouldn't have it anymore, but it does
print(index.tz_localize(None).freq)

Problem description

tz_localize(None) doesn't check whether the frequency is still consistent with the newly localized index. When the index spans a Daylight Saving Time switch, the frequency of the index should become None, since dropping the timezone leaves duplicated or missing wall-clock timestamps.
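
To make the inconsistency concrete, here is a minimal sketch that compares the spacing between consecutive timestamps before and after the timezone is dropped. It uses "30min" instead of "30T" as an assumption, so that it also runs on recent pandas versions where the "T" alias is deprecated; the variable names are only illustrative.

```python
import pandas as pd

# Same range as in the report, spelled with the "30min" alias.
index = pd.date_range("2019-03-31", freq="30min", periods=10, tz="Europe/London")
naive = index.tz_localize(None)

# Step between each timestamp and the one before it.
aware_steps = index[1:] - index[:-1]
naive_steps = naive[1:] - naive[:-1]

print(sorted(set(aware_steps)))  # distinct steps of the tz-aware index
print(sorted(set(naive_steps)))  # distinct steps once the tz is dropped
print(naive.inferred_freq)       # what pandas can infer from the naive values
```

The tz-aware index advances in uniform 30-minute steps, while the naive one contains a 90-minute jump where the local clock skips from 00:30 to 02:00, so no single freq can describe it.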

Expected Output

Achievable with:

new_index = index.tz_localize(None)
new_index.freq = new_index.inferred_freq  # drop frequency if not inferrable
print(new_index)  # this is correct
print(new_index.freq)
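
The same workaround could be wrapped in a small helper. This is only a sketch: the function name drop_tz_with_consistent_freq is hypothetical, and it rebuilds the index through the DatetimeIndex constructor instead of assigning to freq in place, which is a design choice of this example and not something pandas does itself. It again assumes the "30min" alias.

```python
import pandas as pd


def drop_tz_with_consistent_freq(index: pd.DatetimeIndex) -> pd.DatetimeIndex:
    """Drop the timezone and keep a freq only if the naive values support one.

    Hypothetical helper mirroring the workaround in this report.
    """
    naive = index.tz_localize(None)
    # inferred_freq is None when the naive timestamps are not evenly spaced,
    # in which case the rebuilt index carries no freq at all.
    return pd.DatetimeIndex(naive, freq=naive.inferred_freq)


# Range that crosses the DST switch on 2019-03-31.
dst = pd.date_range("2019-03-31", freq="30min", periods=10, tz="Europe/London")
# Range on an ordinary day with no offset change.
regular = pd.date_range("2019-03-01", freq="30min", periods=10, tz="Europe/London")

print(drop_tz_with_consistent_freq(dst).freq)
print(drop_tz_with_consistent_freq(regular).freq)
```

With this approach the freq is dropped for the range that crosses the switch and kept for the range that does not, which is the behavior the report expects from tz_localize(None).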

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.7.5.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None

pandas : 0.25.3
numpy : 1.17.4
pytz : 2019.3
dateutil : 2.8.0
pip : 19.3.1
setuptools : 42.0.2.post20191203
Cython : 0.29.14
pytest : 5.3.2
hypothesis : 4.54.2
sphinx : 2.3.0
blosc : None
feather : None
xlsxwriter : 1.2.6
lxml.etree : 4.4.2
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.10.2
pandas_datareader: None
bs4 : 4.8.1
bottleneck : 1.3.1
fastparquet : None
gcsfs : None
lxml.etree : 4.4.2
matplotlib : 3.1.1
numexpr : 2.7.0
odfpy : None
openpyxl : 3.0.2
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : 0.4.0
scipy : 1.3.2
sqlalchemy : 1.3.11
tables : 3.6.1
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.6