BUG: apply() fails on some value types · Issue #34529 · pandas-dev/pandas

We have some existing code that manipulates data decoded into numpy arrays (by a C-powered backend). This code has stopped working.

I've tried to strip it down to a reduced case:

import pandas as pd
import numpy as np
df = pd.DataFrame(np.array([b'abcd', b'efgh']), columns=['col'])
df.apply(lambda x: x.astype('object'))

This fails with an error inside an internal function of apply:

ValueError                                Traceback (most recent call last)
<ipython-input-88-a5fa9cabd101> in <module>
----> 1 df.apply(lambda x: x.astype('object'))

~/local/pkg/miniconda3/envs/odc/lib/python3.8/site-packages/pandas/core/frame.py in apply(self, func, axis, raw, result_type, args, **kwds)
   6876             kwds=kwds,
   6877         )
-> 6878         return op.get_result()
   6879 
   6880     def applymap(self, func) -> "DataFrame":

~/local/pkg/miniconda3/envs/odc/lib/python3.8/site-packages/pandas/core/apply.py in get_result(self)
    184             return self.apply_raw()
    185 
--> 186         return self.apply_standard()
    187 
    188     def apply_empty_result(self):

~/local/pkg/miniconda3/envs/odc/lib/python3.8/site-packages/pandas/core/apply.py in apply_standard(self)
    293 
    294             try:
--> 295                 result = libreduction.compute_reduction(
    296                     values, self.f, axis=self.axis, dummy=dummy, labels=labels
    297                 )

pandas/_libs/reduction.pyx in pandas._libs.reduction.compute_reduction()

pandas/_libs/reduction.pyx in pandas._libs.reduction.Reducer.__init__()

pandas/_libs/reduction.pyx in pandas._libs.reduction.Reducer._check_dummy()

ValueError: Dummy array must be same dtype
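Since the error message mentions dtype, it may help to compare the numpy dtype of the failing bytes array with that of a plain str array. The snippet below is only a diagnostic sketch: the names `bytes_arr`, `str_arr`, `obj_df` and `result` are made up for illustration, and casting to object before building the DataFrame is an assumed workaround, not a confirmed fix for the underlying problem.

```python
import numpy as np
import pandas as pd

# The array from the reduced case: fixed-width bytes (numpy dtype kind 'S').
bytes_arr = np.array([b'abcd', b'efgh'])
# A str array for comparison: fixed-width unicode (numpy dtype kind 'U').
str_arr = np.array(['abcd', 'efgh'])

print(bytes_arr.dtype, bytes_arr.dtype.kind)
print(str_arr.dtype, str_arr.dtype.kind)

# Assumed workaround (hypothetical, not verified against the failing version):
# cast to object *before* constructing the DataFrame, so the column is
# already object dtype by the time apply() sees it.
obj_df = pd.DataFrame(bytes_arr.astype(object), columns=['col'])
result = obj_df.apply(lambda x: x.astype('object'))
print(result)
```

If the workaround holds, `result` should contain the same two bytes values in an object column.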

If we use a str array instead of a bytes array, then it works.

df = pd.DataFrame(np.array(['abcd', 'efgh']), columns=['col'])
result = df.apply(lambda x: x.astype('object'))
print(result)

which gives the expected result