scatter_matrix color vs c · Issue #14855 · pandas-dev/pandas

I reported the issue below to matplotlib, but they asked me to report it here, since it appears (to the matplotlib folks) to be related to pandas.

The following code raises an error, but it works when color= is replaced with c=.

I suspect this is either a bug or a gap in the documentation.
(The matplotlib folks think it is a pandas / scatter_matrix bug; see below.)

I'm using:
matplotlib 1.5.3
pandas 0.19.1
python 3.5

This is the sample code:

import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
%matplotlib inline

iris = load_iris()
palette = {0: "red", 1: "green", 2: "blue"}

# Using the palette dictionary, convert
# each numeric class into a color string.
colors = [palette[int(c)] for c in iris.target]

dataframe = pd.DataFrame(iris.data,
                         columns=iris.feature_names)
scatterplot = pd.scatter_matrix(dataframe, alpha=0.3,
                                figsize=(10, 10), diagonal='hist', color=colors, marker='o', grid=True)

This is the error trace:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-f3d30057e124> in <module>()
     14 columns=iris.feature_names)
     15 scatterplot = pd.scatter_matrix(dataframe, alpha=0.3,
---> 16 figsize=(10, 10), diagonal='hist', color=colors, marker='o', grid=True)

/Users/e/anaconda/lib/python3.5/site-packages/pandas/tools/plotting.py in scatter_matrix(frame, alpha, figsize, ax, grid, diagonal, marker, density_kwds, hist_kwds, range_padding, **kwds)
    393 
    394                 ax.scatter(df[b][common], df[a][common],
--> 395                            marker=marker, alpha=alpha, **kwds)
    396 
    397                 ax.set_xlim(boundaries_list[j])

/Users/e/anaconda/lib/python3.5/site-packages/matplotlib/__init__.py in inner(ax, *args, **kwargs)
   1817                     warnings.warn(msg % (label_namer, func.__name__),
   1818                                   RuntimeWarning, stacklevel=2)
-> 1819             return func(ax, *args, **kwargs)
   1820         pre_doc = inner.__doc__
   1821         if pre_doc is None:

/Users/e/anaconda/lib/python3.5/site-packages/matplotlib/axes/_axes.py in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, **kwargs)
   3787                 facecolors = co
   3788             if c is not None:
-> 3789                 raise ValueError("Supply a 'c' kwarg or a 'color' kwarg"
   3790                                  " but not both; they differ but"
   3791                                  " their functionalities overlap.")

ValueError: Supply a 'c' kwarg or a 'color' kwarg but not both; they differ but their functionalities overlap.

According to @goyodiaz:

It looks like a pandas issue. pandas.scatter_matrix always passes c to scatter, but the caller can optionally pass more keyword arguments, including color. If the caller passes only color, then scatter_matrix will pass both c and color, and matplotlib will raise with a message that is confusing to the user because they did not pass c. scatter_matrix should probably either raise with a clearer message or drop the c keyword argument when the user passes color.
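To make the quoted diagnosis concrete, here is a minimal pure-Python sketch of the keyword conflict and of the second suggested remedy (not filling in c when the caller passes color). It is not the pandas or matplotlib source: fake_scatter, buggy_kwargs, fixed_kwargs, and the default color "b" are all hypothetical names and values chosen for illustration, and fake_scatter only imitates the c/color check seen in the traceback.

```python
def fake_scatter(x, y, c=None, **kwargs):
    """Hypothetical stand-in for Axes.scatter; it only mimics the c/color check."""
    if c is not None and kwargs.get("color") is not None:
        raise ValueError("Supply a 'c' kwarg or a 'color' kwarg but not both")
    return dict(kwargs, c=c)


def buggy_kwargs(user_kwds, default_c="b"):
    """Always fill in 'c', as the quoted comment describes (assumed behavior)."""
    kwds = dict(user_kwds)
    kwds.setdefault("c", default_c)
    return kwds


def fixed_kwargs(user_kwds, default_c="b"):
    """Fill in 'c' only when the caller supplied neither 'c' nor 'color'."""
    kwds = dict(user_kwds)
    if "c" not in kwds and "color" not in kwds:
        kwds["c"] = default_c
    return kwds


user_kwds = {"color": ["red", "green", "blue"]}

# The caller passed only color, yet both c and color reach the scatter call.
try:
    fake_scatter([1, 2, 3], [4, 5, 6], **buggy_kwargs(user_kwds))
    buggy_raised = False
except ValueError:
    buggy_raised = True

# With the guarded default, only color is forwarded and no error is raised.
result = fake_scatter([1, 2, 3], [4, 5, 6], **fixed_kwargs(user_kwds))
```

The guarded default keeps the existing behavior for callers who pass neither keyword, and leaves an explicit c or color untouched. Raising a clearer error instead, the other option mentioned in the comment, would be an equally small change to the same helper.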