BUG: fix Series.apply(..., by_row), v2. by topper-123 · Pull Request #53601 · pandas-dev/pandas

I was thinking the removal of by_row=True would also apply in the Series case, is that right? If not, a lot of my requests below are invalid.

Unfortunately not. For Series.apply we have:

>>> ser = pd.Series([1, np.nan, 2])
>>> ser.apply(np.sum, by_row=True)
0    1.0
1    NaN
2    2.0
dtype: float64
>>> ser.apply(np.sum, by_row="compat")
3.0

For Series.apply, by_row="compat" ends up calling SeriesApply.apply_compat when given a dict-like or list-like. For example:

>>> ser.apply(np.sum)  # by_row=True is implicit here
0    1.0
1    NaN
2    2.0
dtype: float64
>>> ser.apply([np.sum])  # for each func in the list, call ser.apply(np.sum, by_row="compat")
sum    3.0
dtype: float64

These two cases can't be combined into one parameter option, so I don't see another way forward here myself.
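To make the difference concrete without relying on the by_row keyword (whose accepted values were still under discussion in this PR), here is a minimal sketch that reproduces the two results using only calls that do not depend on that parameter. The variable names `elementwise` and `whole` are illustrative and are not part of pandas.

```python
import numpy as np
import pandas as pd

ser = pd.Series([1, np.nan, 2])

# Element-wise case: the function is called once per value, so np.sum
# receives a scalar each time and the result keeps the original shape.
elementwise = ser.apply(np.sum)

# Whole-Series case: the function is called once with the full Series,
# so np.sum reduces it to a single value (pandas skips NaN by default).
whole = np.sum(ser)

print(elementwise)
print(whole)
```

The first result has one entry per input row, while the second collapses to a scalar, which is why a single option value cannot describe both behaviours.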

I'll look into your detailed questions later today.