REGR: Assigning multiple new columns with loc fails when index is a MultiIndex · Issue #39147 · pandas-dev/pandas

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.

Code Sample, a copy-pastable example

import pandas as pd

rows = [(1, 2), (3, 4)]
initial_cols = ["a", "b"]

df = pd.DataFrame(42, index=pd.MultiIndex.from_tuples(rows), columns=initial_cols)

new_cols = ["c", "d"]
df.loc[:, new_cols] = None

Problem description

When running the above code in pandas 1.1.5, you will get the following error:

  File "foo.py", line 11, in <module>
    df.loc[:, new_cols] = None
  File "/home/guido/anaconda3/envs/pandas_115_bug/lib/python3.7/site-packages/pandas/core/indexing.py", line 666, in __setitem__
    indexer = self._get_setitem_indexer(key)
  File "/home/guido/anaconda3/envs/pandas_115_bug/lib/python3.7/site-packages/pandas/core/indexing.py", line 609, in _get_setitem_indexer
    return self._convert_tuple(key, is_setter=True)
  File "/home/guido/anaconda3/envs/pandas_115_bug/lib/python3.7/site-packages/pandas/core/indexing.py", line 734, in _convert_tuple
    idx = self._convert_to_indexer(k, axis=i, is_setter=is_setter)
  File "/home/guido/anaconda3/envs/pandas_115_bug/lib/python3.7/site-packages/pandas/core/indexing.py", line 1198, in _convert_to_indexer
    return self._get_listlike_indexer(key, axis, raise_missing=True)[1]
  File "/home/guido/anaconda3/envs/pandas_115_bug/lib/python3.7/site-packages/pandas/core/indexing.py", line 1254, in _get_listlike_indexer
    self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
  File "/home/guido/anaconda3/envs/pandas_115_bug/lib/python3.7/site-packages/pandas/core/indexing.py", line 1298, in _validate_read_indexer
    raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Index(['c', 'd'], dtype='object')] are in the [columns]"

Up to and including version 1.1.4, the above code extended the DataFrame with the two new columns c and d. Since the release of pandas 1.1.5, it raises the error above. This might be related to the fix for #37711.

Single-column assignments still work: replacing the multi-column assignment with the following two separate assignments succeeds:

df.loc[:, "c"] = None
df.loc[:, "d"] = None
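
The single-column workaround can be wrapped in a small loop so it scales to any number of new columns. This is only a sketch of that idea: the helper name add_empty_columns is hypothetical (not a pandas API), and it works on a copy so the original frame is left untouched.

```python
import pandas as pd


def add_empty_columns(df, new_cols):
    """Add each label in new_cols as a column filled with None, one at a time.

    Hypothetical helper for illustration; not part of pandas.
    """
    out = df.copy()  # leave the caller's frame untouched
    for col in new_cols:
        # Single-label assignment, the form the report says still works in 1.1.5.
        out.loc[:, col] = None
    return out


index = pd.MultiIndex.from_tuples([(1, 2), (3, 4)])
df = pd.DataFrame(42, index=index, columns=["a", "b"])
result = add_empty_columns(df, ["c", "d"])
print(result.columns.tolist())
```

Assigning one label at a time avoids the list-like indexer path that appears in the traceback above.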

The problem also occurs only when the index is a MultiIndex. With a flat (non-MultiIndex) index, the same assignment raises no error.
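
To check which index types are affected on a given installation, the same assignment can be run against both a flat index and a MultiIndex. The following is a minimal diagnostic sketch; the helper name try_multi_column_assign is an assumption made for illustration, and the outcome for the MultiIndex case depends on the installed pandas version.

```python
import pandas as pd


def try_multi_column_assign(index):
    """Run the multi-column assignment from the report on a frame with this index.

    Returns "ok" on success, or the name of the exception type that was raised.
    Hypothetical helper for illustration; not part of pandas.
    """
    df = pd.DataFrame(42, index=index, columns=["a", "b"])
    try:
        df.loc[:, ["c", "d"]] = None
    except Exception as exc:  # report the failure instead of stopping
        return type(exc).__name__
    return "ok"


results = {
    "flat": try_multi_column_assign(pd.Index([1, 3])),
    "multi": try_multi_column_assign(pd.MultiIndex.from_tuples([(1, 2), (3, 4)])),
}
print(pd.__version__, results)
```

Based on the behaviour described above, pandas 1.1.5 would be expected to report a KeyError for the "multi" entry while "flat" succeeds.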

Expected Output

Expected output is that no error occurs and that the two columns c and d are added with values None for all rows.

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit : b5958ee
python : 3.7.9.final.0
python-bits : 64
OS : Linux
OS-release : 5.8.0-33-generic
Version : #36-Ubuntu SMP Wed Dec 9 09:14:40 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.1.5
numpy : 1.19.2
pytz : 2020.5
dateutil : 2.8.1
pip : 20.3.3
setuptools : 51.1.2.post20210112
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None
