BUG: Resampling PeriodIndex-ed to multiple of frequencies not working as expected · Issue #15944 · pandas-dev/pandas
Code Sample, a copy-pastable example if possible
In [1]: import numpy as np
In [2]: import pandas as pd

In [3]: s = pd.Series([2017, 2018], index=pd.period_range('2017', freq='A', periods=2))

In [4]: s.resample('2Q', kind='period').ffill()
Warning: multiple of frequency -> timestamps
Out[4]:
2017-03-31    2017
2017-09-30    2017
2018-03-31    2018
Freq: 2Q-DEC, dtype: int64

In [7]: s2 = pd.Series(np.arange(12), index=pd.period_range('2017-01', freq='M', periods=12))

In [8]: s2.resample('2Q', kind='period').mean()
Warning: multiple of frequency -> timestamps
Out[8]:
2017-03-31     1.0
2017-09-30     5.5
2018-03-31    10.0
Freq: 2Q-DEC, dtype: float64
For comparison, resampling to the base frequency (no multiple) returns a PeriodIndex-ed result that covers the full original time span:
In [5]: s.resample('Q', kind='period').ffill()
Out[5]:
2017Q1    2017
2017Q2    2017
2017Q3    2017
2017Q4    2017
2018Q1    2018
2018Q2    2018
2018Q3    2018
2018Q4    2018
Freq: Q-DEC, dtype: int64

In [9]: s2.resample('Q', kind='period').mean()
Out[9]:
2017Q1     1
2017Q2     4
2017Q3     7
2017Q4    10
Freq: Q-DEC, dtype: int64
Problem description
- I'd expect resampling a PeriodIndex-ed series/dataframe to return a PeriodIndex-ed result by default, all the more so when given kind='period'.
- Moreover, I'd expect the resampling result to cover the full original time span (upsampling A->2Q would return 2 periods per year, downsampling M->2Q would return 2 periods per 12 months).
As indicated by the warning message, resampling to a multiple of a frequency falls back to timestamp-based resampling. Both expectations are met when resampling to a "base" frequency without any multiple.
Expected Output
In [4]: s.resample('2Q', kind='period').ffill()
Out[4]:
2017Q1    2017
2017Q3    2017
2018Q1    2018
2018Q3    2018
Freq: 2Q-DEC, dtype: int64
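Until resample handles this case, the expected upsampled values can be reproduced by hand. The following is only a rough sketch of a workaround, not how resample works internally: it assumes the annual series from above, builds the half-year labels as strings, and repeats each yearly value twice in place of a forward fill. It uses the 'Y' alias for the annual frequency, which I assume is equivalent to the 'A' used above, and the resulting index has freq Q-DEC rather than the 2Q-DEC I'd expect from resample.

```python
import numpy as np
import pandas as pd

# Same annual series as in the report ('Y' assumed equivalent to 'A').
s = pd.Series([2017, 2018], index=pd.period_range('2017', freq='Y', periods=2))

# Two half-year bins per year, labelled by their first quarter (Q1 and Q3).
labels = ["{}Q{}".format(p.year, q) for p in s.index for q in (1, 3)]
target = pd.PeriodIndex(labels, freq='Q-DEC')

# Forward fill by hand: each yearly value is repeated for both half-years.
upsampled = pd.Series(np.repeat(s.values, 2), index=target)
print(upsampled)
```

The result is PeriodIndex-ed and covers both years completely, which are the two properties missing from the actual output.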
In [8]: s2.resample('2Q', kind='period').mean()
Out[8]:
2017Q1    2.5
2017Q3    8.5
Freq: 2Q-DEC, dtype: float64
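The expected downsampled means can likewise be checked without resample. This is a minimal sketch under my own assumptions, not a proposed fix: it assigns each month to a half-year bin labelled by the bin's first quarter (Q1 or Q3) and groups on those labels. The helper names (first_quarter, bins) are made up for illustration, and the resulting index has freq Q-DEC rather than 2Q-DEC.

```python
import numpy as np
import pandas as pd

# Same monthly series as in the report.
s2 = pd.Series(np.arange(12), index=pd.period_range('2017-01', freq='M', periods=12))

# Quarter that opens each month's half-year bin: Q1/Q2 -> 1, Q3/Q4 -> 3.
first_quarter = ((s2.index.quarter - 1) // 2) * 2 + 1
labels = ["{}Q{}".format(y, q) for y, q in zip(s2.index.year, first_quarter)]
bins = pd.PeriodIndex(labels, freq='Q-DEC')

# Group the monthly values by their half-year bin and average them.
downsampled = s2.groupby(bins).mean()
print(downsampled)
```

This gives 2.5 for the bin starting at 2017Q1 and 8.5 for the bin starting at 2017Q3, i.e. 2 periods per 12 months with no bin spilling into 2018.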
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
LOCALE: de_DE.UTF-8
pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 33.1.1.post20170320
Cython: None
numpy: 1.12.1
scipy: 0.19.0
statsmodels: 0.8.0
xarray: None
IPython: 5.3.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: None
numexpr: 2.6.2
matplotlib: 2.0.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.5.3
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.5
boto: None
pandas_datareader: None