BUG: Series.update() raises ValueError if dtype="string" · Issue #33980 · pandas-dev/pandas


Code Sample, a copy-pastable example

import pandas as pd
a = pd.Series(["a", None, "c"], dtype="string")
b = pd.Series([None, "b", None], dtype="string")
a.update(b)

results in:

Traceback (most recent call last):

File "", line 1, in a.update(b)

File "C:\tools\anaconda3\envs\Simple\lib\site-packages\pandas\core\series.py", line 2810, in update
    self._data = self._data.putmask(mask=mask, new=other, inplace=True)

File "C:\tools\anaconda3\envs\Simple\lib\site-packages\pandas\core\internals\managers.py", line 564, in putmask
    return self.apply("putmask", **kwargs)

File "C:\tools\anaconda3\envs\Simple\lib\site-packages\pandas\core\internals\managers.py", line 442, in apply
    applied = getattr(b, f)(**kwargs)

File "C:\tools\anaconda3\envs\Simple\lib\site-packages\pandas\core\internals\blocks.py", line 1676, in putmask
    new_values[mask] = new

File "C:\tools\anaconda3\envs\Simple\lib\site-packages\pandas\core\arrays\string_.py", line 248, in __setitem__
    super().__setitem__(key, value)

File "C:\tools\anaconda3\envs\Simple\lib\site-packages\pandas\core\arrays\numpy_.py", line 252, in __setitem__
    self._ndarray[key] = value

ValueError: NumPy boolean array indexing assignment cannot assign 3 input values to the 1 output values where the mask is true
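The last frame of the traceback suggests that the full-length replacement values are assigned into a boolean-masked selection. A minimal NumPy-only sketch of that mechanism follows; the names `arr`, `new` and `mask` are illustrative and not taken from the pandas source, and this is only my reading of the traceback, not a confirmed diagnosis.

```python
import numpy as np

# Illustrative stand-ins for the values involved in a.update(b).
arr = np.array(["a", None, "c"], dtype=object)
new = np.array([None, "b", None], dtype=object)
mask = np.array([False, True, False])  # True where `new` is not missing

try:
    # Three values are offered for the single slot selected by the mask.
    arr[mask] = new
    error = None
except ValueError as exc:
    error = str(exc)
print(error)

# Selecting the replacement values with the same mask makes the sizes match.
arr[mask] = new[mask]
print(arr.tolist())
```

If this reading is right, the object-dtype path avoids the error because it does not go through a plain masked assignment with unfiltered values.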

Problem description

The example works if I leave off the dtype="string" (resulting in the implicit dtype object).
IMO update should work for all dtypes, not only the "old" ones.

The same happens with a = pd.Series([1, None, 3], dtype="Int16") and similar: update also raises ValueError, while the same code with dtype="float64" works.

It looks as if update doesn't work with the new nullable dtypes (the ones with pd.NA).
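To check which dtypes are affected on a given installation, something like the following sketch could be used. The helper `update_works` is a hypothetical name introduced here for illustration; the outcome for "string" and "Int16" depends on the pandas version, so no output is shown.

```python
import pandas as pd


def update_works(values, other_values, dtype):
    """Return True if Series.update() completes without ValueError."""
    a = pd.Series(values, dtype=dtype)
    b = pd.Series(other_values, dtype=dtype)
    try:
        a.update(b)
    except ValueError:
        return False
    return True


# One entry per dtype mentioned in this report.
results = {
    "object": update_works(["a", None, "c"], [None, "b", None], object),
    "string": update_works(["a", None, "c"], [None, "b", None], "string"),
    "float64": update_works([1, None, 3], [None, 2, None], "float64"),
    "Int16": update_works([1, None, 3], [None, 2, None], "Int16"),
}

for dtype, ok in results.items():
    print(f"{dtype}: {'ok' if ok else 'raises ValueError'}")
```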

Expected Output

The expected result is that a.update(b) updates a without raising an exception, not only for object and float64, but also for string, Int16, etc.
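Until update handles the nullable dtypes, one possible workaround is to do the update on object-dtype copies and cast back afterwards. This is only a sketch, assuming the round trip through object dtype is acceptable for the data; `tmp` and `result` are names chosen here, and I have not verified it on every pandas version.

```python
import pandas as pd

a = pd.Series(["a", None, "c"], dtype="string")
b = pd.Series([None, "b", None], dtype="string")

# Work on an object-dtype copy, where update() is reported to work.
tmp = a.astype(object)
tmp.update(b.astype(object))

# Cast back to the nullable string dtype.
result = tmp.astype("string")
print(result.tolist())
```

Unlike a.update(b), this does not modify a in place, so the result has to be assigned back if the original name should hold the updated values.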

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.7.7.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 ..., GenuineIntel
...

pandas : 1.0.3
numpy : 1.18.1
...