Breaking examples due to resample refactor · Issue #12448 · pandas-dev/pandas
While using master a bit, I discovered some more cases where the new resample API breaks things:
- Plotting. .plot is now a dedicated groupby/resample method (it adds each group individually to the plot), while I think it is a very common idiom to quickly resample your timeseries and plot it with (old API) e.g. s.resample('D').plot().
Example with master:
In [1]: s = pd.Series(np.random.randn(60), index=pd.date_range('2016-01-01', periods=60, freq='1min'))
In [3]: s.resample('15min').plot()
Out[3]:
2016-01-01 00:00:00 Axes(0.125,0.1;0.775x0.8)
2016-01-01 00:15:00 Axes(0.125,0.1;0.775x0.8)
2016-01-01 00:30:00 Axes(0.125,0.1;0.775x0.8)
2016-01-01 00:45:00 Axes(0.125,0.1;0.775x0.8)
Freq: 15T, dtype: object
while previously it would just have given you one continuous line.
This one can be solved, I think, by special-casing plot for resample (not making it a special groupby-like method, but letting it warn and pass the resample().mean() result to Series.plot(), like the 'deprecated_valids').
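For reference, a minimal sketch of what the explicit new-API spelling could look like. The deterministic data and the variable name resampled are my own choices for illustration, and the plot call is left commented out because it needs matplotlib:

```python
import numpy as np
import pandas as pd

# One value per minute for an hour (deterministic instead of random data).
s = pd.Series(
    np.arange(60, dtype=float),
    index=pd.date_range("2016-01-01", periods=60, freq="1min"),
)

# Reduce explicitly first: the result is an ordinary Series with one
# value per 15-minute bin, not a Resampler object.
resampled = s.resample("15min").mean()

# Plotting that Series draws a single continuous line (requires matplotlib):
# resampled.plot()
```

Because the aggregation is spelled out, the result does not depend on how .plot is dispatched on the Resampler.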
- When you previously called a method on the resample result that is now also a valid Resampler method. E.g. s.resample(freq).min() would previously have given you the "minimum daily average", while now it will give you the "minimum per day".
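To make the difference concrete, here is a small sketch with made-up data (the values and the names old_style and new_style are only for illustration). The old behaviour is emulated by calling .mean() explicitly, assuming the old default how='mean':

```python
import pandas as pd

# Two observations per day over two days (illustrative values).
s = pd.Series(
    [1.0, 3.0, 10.0, 20.0],
    index=pd.to_datetime(
        ["2016-01-01 00:00", "2016-01-01 12:00",
         "2016-01-02 00:00", "2016-01-02 12:00"]
    ),
)

# Old semantics, emulated: resample to daily means first, then take the
# minimum of those means (a scalar, the "minimum daily average").
old_style = s.resample("D").mean().min()

# New semantics: .min() is itself the Resampler aggregation, so this is
# a Series holding the minimum of each day.
new_style = s.resample("D").min()
```

The same expression therefore changes both its value and its type (scalar versus Series) between the two APIs.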
This one is more difficult/impossible to solve, I think? You could detect that case if you knew it was old code, but you cannot distinguish it from perfectly valid code using the new API. If we can't solve it, I think it deserves some mention in the whatsnew explanation.
- Using resample on a groupby object (xref "Resampling converts int to float, but only in group by", #12202). Using the example of that issue, with 0.17.1 you get:
In [1]: df = pd.DataFrame({'date': pd.date_range(start='2016-01-01', periods=4,
   ...:                                          freq='W'),
   ...:                    'group': [1, 1, 2, 2],
   ...:                    'val': [5, 6, 7, 8]})
In [2]: df.set_index('date', inplace=True)
In [3]: df
Out[3]:
            group  val
date
2016-01-03      1    5
2016-01-10      1    6
2016-01-17      2    7
2016-01-24      2    8
In [4]: df.groupby('group').resample('1D', fill_method='ffill')
Out[4]:
                  val
group date
1     2016-01-03    5
      2016-01-04    5
      2016-01-05    5
      2016-01-06    5
      2016-01-07    5
      2016-01-08    5
      2016-01-09    5
      2016-01-10    6
2     2016-01-17    7
      2016-01-18    7
      2016-01-19    7
      2016-01-20    7
      2016-01-21    7
      2016-01-22    7
      2016-01-23    7
      2016-01-24    8
In [5]: pd.__version__
Out[5]: u'0.17.1'
while with master you get:
In [29]: df.groupby('group').resample('1D', fill_method='ffill')
Out[29]: <pandas.core.groupby.DataFrameGroupBy object at 0x0000000009BA73C8>
which will give you different results/errors with further operations on it. Also, this case does not raise any FutureWarning (which it should, as the user should adapt the code to groupby().resample('D').ffill()).
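For completeness, a sketch of how the adapted groupby call might look with the new API, using the same data as above. Selecting the 'val' column before resampling is my own choice here, to keep the result independent of how a given pandas version treats the grouping column; the exact output layout may differ between versions:

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.date_range(start="2016-01-01", periods=4, freq="W"),
    "group": [1, 1, 2, 2],
    "val": [5, 6, 7, 8],
}).set_index("date")

# New-API spelling: resample within each group, then forward-fill
# explicitly instead of passing fill_method='ffill'.
filled = df.groupby("group")["val"].resample("1D").ffill()

# filled is a Series indexed by (group, date), with eight daily rows
# per group.
```

This should give the same per-group daily values as the 0.17.1 output shown earlier, just as a Series rather than a one-column DataFrame.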