BUG: df.plot.scatter() with norm keyword fails with 'multiple values for keyword argument' TypeError · Issue #45809 · pandas-dev/pandas

Pandas version checks

Reproducible Example

import pandas as pd
import matplotlib

df = pd.DataFrame([{'1': 1.0, '2': 2.0, '3': 100}, {'1': 1.2, '2': 1.8, '3': 10}])

df.plot.scatter(x="1", y="2", c="3", norm=matplotlib.colors.LogNorm())

Issue Description

The code above creates a DataFrame with three numeric columns and plots it as a 2D scatter plot, with the color axis taken from one of the columns.

When running this code the following traceback is generated:

Traceback (most recent call last):
  File "C:\ansysdev\pandas-bug.py", line 7, in <module>
    df.plot.scatter(x="1", y="2", c="3", norm=matplotlib.colors.LogNorm())
  File "C:\ansysdev\pandas-venv\lib\site-packages\pandas\plotting\_core.py", line 1669, in scatter
    return self(kind="scatter", x=x, y=y, s=s, c=c, **kwargs)
  File "C:\ansysdev\pandas-venv\lib\site-packages\pandas\plotting\_core.py", line 917, in __call__
    return plot_backend.plot(data, x=x, y=y, kind=kind, **kwargs)
  File "C:\ansysdev\pandas-venv\lib\site-packages\pandas\plotting\_matplotlib\__init__.py", line 71, in plot
    plot_obj.generate()
  File "C:\ansysdev\pandas-venv\lib\site-packages\pandas\plotting\_matplotlib\core.py", line 329, in generate
    self._make_plot()
  File "C:\ansysdev\pandas-venv\lib\site-packages\pandas\plotting\_matplotlib\core.py", line 1119, in _make_plot
    scatter = ax.scatter(
TypeError: matplotlib.axes._axes.Axes.scatter() got multiple values for keyword argument 'norm'
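
The error in the traceback above is ordinary Python call behaviour rather than anything specific to matplotlib: a call that names a keyword explicitly and also receives the same key through ** unpacking is rejected. A minimal sketch of that mechanism follows, assuming a hypothetical stand-in function scatter_stub in place of Axes.scatter and a plain dict in place of the plot object's keyword dict:

```python
def scatter_stub(x, y, c=None, norm=None, **kwargs):
    """Hypothetical stand-in for Axes.scatter; it only echoes the norm it got."""
    return norm


def make_plot(kwds):
    """Mimic the failing call: norm is passed explicitly and may also be in kwds."""
    norm = None  # an explicit local value, as in a branch that hard-codes None
    return scatter_stub([1.0, 1.2], [2.0, 1.8], c=[100, 10], norm=norm, **kwds)


error_message = None
try:
    # The string is a placeholder for a user-supplied norm object.
    make_plot({"norm": "log-norm placeholder"})
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The same call succeeds when the dict does not contain a norm key, which is why the problem only appears once the user passes norm.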

Expected Behavior

I believe the issue was caused by this PR: #34293, specifically in this code block:

    if color_by_categorical:
        from matplotlib import colors

        n_cats = len(self.data[c].cat.categories)
        cmap = colors.ListedColormap([cmap(i) for i in range(cmap.N)])
        bounds = np.linspace(0, n_cats, n_cats + 1)
        norm = colors.BoundaryNorm(bounds, cmap.N)
    else:
        norm = None

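Since color_by_categorical is false here (column "3" is numeric, not categorical), norm is set to None. However, norm is also present in self.kwds, so ax.scatter receives it twice: once explicitly and once through the unpacked keyword arguments. That is where the 'multiple values for keyword argument' TypeError comes from. The else branch should probably do something like: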

norm = self.kwds.pop('norm', None)

This would result in the user's choice of norm being respected.
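
The suggested change can be sketched in isolation. This illustrates the pop pattern only and is not the actual pandas code: scatter_stub, make_plot_fixed, and the placeholder string norms are hypothetical names introduced for the example.

```python
def scatter_stub(x, y, c=None, norm=None, **kwargs):
    """Hypothetical stand-in for Axes.scatter; it only echoes the norm it got."""
    return norm


def make_plot_fixed(kwds, color_by_categorical=False):
    """Sketch of the proposed else branch: take norm out of kwds before the call."""
    kwds = dict(kwds)  # copy so the caller's dict is not mutated (a choice made for this sketch)
    if color_by_categorical:
        norm = "boundary-norm placeholder"  # stands in for colors.BoundaryNorm(...)
    else:
        # Remove the user's norm from kwds so it is passed exactly once.
        norm = kwds.pop("norm", None)
    return scatter_stub([1.0, 1.2], [2.0, 1.8], c=[100, 10], norm=norm, **kwds)


user_norm = make_plot_fixed({"norm": "log-norm placeholder"})
default_norm = make_plot_fixed({})
print(user_norm, default_norm)
```

Like the proposal, this sketch only changes the else branch. In the categorical branch, a norm left in the keyword dict would presumably still collide with the explicit argument.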

If the maintainers agree with this change, I'm happy to put together a PR for it.

Installed Versions

INSTALLED VERSIONS

commit : 7651c08
python : 3.10.0.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19042
machine : AMD64
processor : Intel64 Family 6 Model 85 Stepping 7, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252

pandas : 1.5.0.dev0+279.g7651c08230
numpy : 1.22.1
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.3
setuptools : 57.4.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.5.1
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None