BUG: freq setter for DatetimeIndex/TimedeltaIndex/PeriodIndex is poorly supported · Issue #20678 · pandas-dev/pandas
1. Setting to a frequency string alias fails for all datetimelike indexes (same error for all three):
In [1]: import pandas as pd
In [2]: dti = pd.DatetimeIndex(['20170101', '20170102', '20170103'])
In [3]: dti
Out[3]: DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03'], dtype='datetime64[ns]', freq=None)
In [4]: dti.freq = 'D'
In [5]: dti
Out[5]: ---------------------------------------------------------------------------
AttributeError: 'str' object has no attribute 'freqstr'
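A minimal sketch of what coercing a string alias could look like, assuming the setter were to normalize its input before storing it. `to_offset` is an existing pandas function; the wrapper `coerce_freq` is a hypothetical name used only for illustration and is not part of pandas.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset


def coerce_freq(freq):
    """Hypothetical helper: turn an alias such as 'D' into a DateOffset.

    Existing offsets are returned as offsets; None stays None.
    """
    if freq is None:
        return None
    return to_offset(freq)


offset = coerce_freq('D')
# Unlike the raw string, the resulting offset has a `freqstr` attribute,
# which is what the repr of the index looks up.
print(type(offset).__name__, offset.freqstr)
```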
2. DatetimeIndex and TimedeltaIndex allow setting to invalid frequencies:
In [6]: dti = pd.DatetimeIndex(['20170101', '20170102', '20170103'])
In [7]: dti.freq = pd.offsets.Day(2)
In [8]: dti
Out[8]: DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03'], dtype='datetime64[ns]', freq='2D')
Note that trying to construct this from scratch fails, as expected:
In [9]: dti = pd.DatetimeIndex(['20170101', '20170102', '20170103'], freq='2D')
ValueError: Inferred frequency D from passed dates does not conform to passed frequency 2D
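The constructor's message refers to an inferred frequency. As a rough illustration, the existing `inferred_freq` property can be compared against the requested frequency. The helper `matches_inferred` is hypothetical, and an equality check like this is stricter than whatever validation the constructor actually performs, so treat it as a sketch rather than the real logic.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

dti = pd.DatetimeIndex(['20170101', '20170102', '20170103'])


def matches_inferred(index, freq):
    """Hypothetical helper: True if `freq` equals the frequency inferred from the values."""
    inferred = index.inferred_freq  # None when no regular frequency can be inferred
    return inferred is not None and to_offset(inferred) == to_offset(freq)


ok_daily = matches_inferred(dti, 'D')     # daily values, daily frequency
ok_two_day = matches_inferred(dti, '2D')  # daily values, two-day frequency
print(ok_daily, ok_two_day)
```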
3. PeriodIndex gives nonsensical output when setting with an offset:
In [10]: pi = pd.PeriodIndex(['2018Q1', '2018Q2', '2018Q3'], freq='Q')
In [11]: pi
Out[11]: PeriodIndex(['2018Q1', '2018Q2', '2018Q3'], dtype='period[Q-DEC]', freq='Q-DEC')
In [12]: pi.freq = pd.offsets.Day()
In [13]: pi
Out[13]: PeriodIndex(['1970-07-12', '1970-07-13', '1970-07-14'], dtype='period[Q-DEC]', freq='D')
In addition to the nonsensical dates, note the incompatibility in [13] between dtype and freq.
Expected Output
I don't know that setting freq for a PeriodIndex should be supported: there's some ambiguity about whether to align to the start or the end of each period, and there's already the PeriodIndex.asfreq method, which gives more control over this.
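For reference, a short example of the control that `PeriodIndex.asfreq` offers through its `how` argument, using the same quarterly index as above. The variable names are illustrative only.

```python
import pandas as pd

pi = pd.PeriodIndex(['2018Q1', '2018Q2', '2018Q3'], freq='Q')

# Convert each quarter to a daily period, aligned to the first day of the quarter.
start = pi.asfreq('D', how='start')

# Same conversion, aligned to the last day of the quarter.
end = pi.asfreq('D', how='end')

print(start)
print(end)
```

Here the alignment is chosen explicitly, which is exactly the ambiguity a bare freq setter would have to resolve silently.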
If setting freq is disallowed for PeriodIndex, should setting also be disallowed for DatetimeIndex and TimedeltaIndex in order to remain consistent? If so, should there then be an analogous asfreq for these two?
My thoughts on the three issues:
- String aliases should be coerced as appropriate (be it via setting or an asfreq method).
- This should raise with a message similar to the constructor's (be it via setting or an asfreq method).
- This should raise, potentially with a message indicating to use asfreq. Or, if we choose to allow setting, this should be consistent with the default asfreq options.
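The three suggestions could be combined roughly as follows. This is only a sketch under stated assumptions: `checked_freq` is a hypothetical function, the error messages are made up for illustration, and the conformance test (regenerating the index with `date_range` and comparing values) is one possible approach rather than what pandas does internally. It handles DatetimeIndex and PeriodIndex only; a TimedeltaIndex would need an analogous check.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset


def checked_freq(index, freq):
    """Hypothetical helper: return the offset for `freq` if `index` conforms to it."""
    if isinstance(index, pd.PeriodIndex):
        # Suggestion 3: refuse and point the user at asfreq.
        raise AttributeError("cannot set freq on a PeriodIndex; use PeriodIndex.asfreq")
    if not isinstance(index, pd.DatetimeIndex):
        raise TypeError("this sketch only handles DatetimeIndex and PeriodIndex")
    if freq is None:
        return None
    offset = to_offset(freq)  # Suggestion 1: coerce string aliases.
    if len(index) == 0:
        return offset
    # Suggestion 2: rebuild the index from its first value and compare.
    expected = pd.date_range(start=index[0], periods=len(index), freq=offset)
    if not (index == expected).all():
        raise ValueError(f"values do not conform to passed frequency {offset.freqstr}")
    return offset


dti = pd.DatetimeIndex(['20170101', '20170102', '20170103'])
pi = pd.PeriodIndex(['2018Q1', '2018Q2', '2018Q3'], freq='Q')

print(checked_freq(dti, 'D'))  # accepted: the values are daily
```

With this sketch, `checked_freq(dti, '2D')` raises a ValueError and `checked_freq(pi, 'D')` raises an AttributeError, mirroring the second and third cases above.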
Output of pd.show_versions()
INSTALLED VERSIONS
commit: fa231e8
python: 3.6.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.23.0.dev0+740.gfa231e8
pytest: 3.1.2
pip: 9.0.1
setuptools: 39.0.1
Cython: 0.25.2
numpy: 1.13.3
scipy: 1.0.0
pyarrow: 0.6.0
xarray: 0.9.6
IPython: 6.1.0
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.4
feather: 0.4.0
matplotlib: 2.0.2
openpyxl: 2.4.8
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 0.9.8
lxml: 3.8.0
bs4: None
html5lib: 0.999
sqlalchemy: 1.1.13
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: 0.1.0
pandas_gbq: None
pandas_datareader: None