ENH: better dtype inference when doing DataFrame reductions by topper-123 · Pull Request #52788 · pandas-dev/pandas
I've actually just this morning made a new version using `_reduce` only (i.e. scrapping `_reduce_and_wrap`). I prefer this new version (because we now have only one reduction method instead of two), but if the other is preferred, it is easy to revert.
I maintain backward compat in this new version by:
- making the `keepdims` parameter keyword-only
- only calling it with `keepdims=True` if the `_reduce` method signature has a parameter named "keepdims"
- if the `_reduce` method signature does not have a parameter named "keepdims", `_reduce` gets called without supplying the `keepdims` parameter, we emit a `FutureWarning`, and we take care of wrapping the (scalar) reduction result in an ndarray before passing it on.
This is possible because `_reduce_and_wrap` was actually only called inside the `blk_func` inside `DataFrame._reduce`, so by doing some introspection there we can keep backward compat. See the new version:
```python
def blk_func(values, axis: Axis = 1):
    if isinstance(values, ExtensionArray):
        if not is_1d_only_ea_dtype(values.dtype) and not isinstance(
            self._mgr, ArrayManager
        ):
            return values._reduce(name, axis=1, skipna=skipna, **kwds)
        # Introspect the array's _reduce to see whether it accepts keepdims.
        sign = signature(values._reduce)
        if "keepdims" in sign.parameters:
            return values._reduce(name, skipna=skipna, keepdims=True, **kwds)
        else:
            warnings.warn(
                f"{type(values)}._reduce will require a `keepdims` parameter "
                "in the future",
                FutureWarning,
                stacklevel=find_stack_level(),
            )
            # Forward the extra keyword arguments unpacked, as in the other calls.
            result = values._reduce(name, skipna=skipna, **kwds)
            # Wrap the scalar result so the caller always gets a 1-element array.
            return np.array([result])
    else:
        return op(values, axis=axis, skipna=skipna, **kwds)
```
Notice especially the `FutureWarning` starting on line 10885. This will allow us to not require `keepdims` now, even though `keepdims` is in the signature of `ExtensionArray._reduce`. In v3.0, we will drop the signature checking and only call `values._reduce` with `keepdims=True`, i.e. it will fail without a `keepdims` parameter in v3.0.
Check out the `test_reduction_without_keepdims` test in `pandas/tests/extension/decimal/test_decimal.py` for a test of what happens when extension arrays don't have a `keepdims` parameter in their `_reduce` method.
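To try the fallback outside pandas, here is a minimal, stdlib-only sketch of the same signature check. `LegacyArray`, `UpdatedArray` and `reduce_keepdims` are hypothetical names made up for illustration, not pandas API, and a one-element list stands in for the `np.array([result])` wrapping. It also shows that a `_reduce` that only has `**kwargs` is still treated as lacking `keepdims`, since the check looks for a parameter literally named "keepdims".

```python
import warnings
from inspect import signature


class LegacyArray:
    """Hypothetical stand-in for a third-party array without `keepdims`."""

    def __init__(self, data):
        self.data = list(data)

    def _reduce(self, name, *, skipna=True, **kwargs):
        # `**kwargs` shows up as a parameter named "kwargs", not "keepdims".
        return {"sum": sum, "min": min, "max": max}[name](self.data)


class UpdatedArray(LegacyArray):
    """Hypothetical stand-in for an array that already accepts `keepdims`."""

    def _reduce(self, name, *, skipna=True, keepdims=False, **kwargs):
        result = super()._reduce(name, skipna=skipna, **kwargs)
        # A one-element list plays the role of a length-1 array here.
        return [result] if keepdims else result


def reduce_keepdims(values, name, skipna=True, **kwds):
    """Return a one-element list whichever `_reduce` signature `values` has."""
    if "keepdims" in signature(values._reduce).parameters:
        return values._reduce(name, skipna=skipna, keepdims=True, **kwds)
    warnings.warn(
        f"{type(values).__name__}._reduce will require a `keepdims` parameter "
        "in the future",
        FutureWarning,
        stacklevel=2,
    )
    # Old-style signature: call without keepdims and wrap the scalar ourselves.
    return [values._reduce(name, skipna=skipna, **kwds)]
```

Both classes give the same wrapped result through `reduce_keepdims`; only `LegacyArray` triggers the `FutureWarning`.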
Thoughts? I prefer this new version, but it is easy to revert if needed.