BUG: df.plot.scatter() with norm keyword fails with 'multiple values for keyword argument' TypeError · Issue #45809 · pandas-dev/pandas
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
import matplotlib

df = pd.DataFrame([{'1': 1.0, '2': 2.0, '3': 100}, {'1': 1.2, '2': 1.8, '3': 10}])
df.plot.scatter(x="1", y="2", c="3", norm=matplotlib.colors.LogNorm())
Issue Description
The code above creates a DataFrame with three numeric columns and plots it as a 2D scatter, with the color axis based on one of the columns.
When running this code the following traceback is generated:
Traceback (most recent call last):
  File "C:\ansysdev\pandas-bug.py", line 7, in <module>
    df.plot.scatter(x="1", y="2", c="3", norm=matplotlib.colors.LogNorm())
  File "C:\ansysdev\pandas-venv\lib\site-packages\pandas\plotting\_core.py", line 1669, in scatter
    return self(kind="scatter", x=x, y=y, s=s, c=c, **kwargs)
  File "C:\ansysdev\pandas-venv\lib\site-packages\pandas\plotting\_core.py", line 917, in __call__
    return plot_backend.plot(data, x=x, y=y, kind=kind, **kwargs)
  File "C:\ansysdev\pandas-venv\lib\site-packages\pandas\plotting\_matplotlib\__init__.py", line 71, in plot
    plot_obj.generate()
  File "C:\ansysdev\pandas-venv\lib\site-packages\pandas\plotting\_matplotlib\core.py", line 329, in generate
    self._make_plot()
  File "C:\ansysdev\pandas-venv\lib\site-packages\pandas\plotting\_matplotlib\core.py", line 1119, in _make_plot
    scatter = ax.scatter(
TypeError: matplotlib.axes._axes.Axes.scatter() got multiple values for keyword argument 'norm'
Expected Behavior
I believe the issue was caused by this PR: #34293, specifically in this code block:
if color_by_categorical:
    from matplotlib import colors

    n_cats = len(self.data[c].cat.categories)
    cmap = colors.ListedColormap([cmap(i) for i in range(cmap.N)])
    bounds = np.linspace(0, n_cats, n_cats + 1)
    norm = colors.BoundaryNorm(bounds, cmap.N)
else:
    norm = None
Since `color_by_categorical` is false here (column "3" is numeric, not categorical), `norm` is set to `None`. However, `norm` is also present in `self.kwds`, so it is passed to `ax.scatter` twice, which is where the 'multiple values for keyword argument' TypeError comes from. The else branch should probably do something like:
norm = self.kwds.pop('norm', None)
This would result in the user's choice of `norm` being respected.
If the maintainers agree with this change, I'm happy to put together a PR with this in.
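To illustrate the mechanism without needing pandas or matplotlib, here is a minimal sketch in plain Python. `fake_scatter`, `make_plot_buggy` and `make_plot_fixed` are hypothetical stand-ins invented for this example, not pandas or matplotlib functions, and the string used for `norm` is just a placeholder for a real `LogNorm()` instance. It only shows why passing `norm=` explicitly alongside a `**kwds` that still contains `'norm'` raises the TypeError, and how popping the key first avoids it.

```python
def fake_scatter(x, y, c=None, norm=None, **kwargs):
    """Hypothetical stand-in for Axes.scatter: returns the norm it received."""
    return norm


def make_plot_buggy(kwds):
    # Mirrors the else branch quoted above: norm is forced to None ...
    norm = None
    # ... while kwds still carries the caller's 'norm', so it is supplied twice.
    return fake_scatter([1.0, 1.2], [2.0, 1.8], norm=norm, **kwds)


def make_plot_fixed(kwds):
    kwds = dict(kwds)  # copy so the caller's dict is left untouched
    # Proposed change: take the user's norm out of kwds, defaulting to None.
    norm = kwds.pop("norm", None)
    return fake_scatter([1.0, 1.2], [2.0, 1.8], norm=norm, **kwds)


user_kwds = {"norm": "log-norm-placeholder"}  # placeholder for LogNorm()

try:
    make_plot_buggy(user_kwds)
    buggy_error = None
except TypeError as exc:
    buggy_error = str(exc)

fixed_norm = make_plot_fixed(user_kwds)

print(buggy_error)
print(fixed_norm)
```

With the `pop`, the user's value reaches the plotting call exactly once, and callers who do not pass `norm` still get `None`, as before.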
Installed Versions
INSTALLED VERSIONS
commit : 7651c08
python : 3.10.0.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19042
machine : AMD64
processor : Intel64 Family 6 Model 85 Stepping 7, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 1.5.0.dev0+279.g7651c08230
numpy : 1.22.1
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.3
setuptools : 57.4.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.5.1
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None