VIS/ENH Hexbin plot by TomAugspurger · Pull Request #5478 · pandas-dev/pandas

Oh, also, I hate the matplotlib argument names for the value (C) and the reduction function (reduce_C_function).

A brief overview:

By default (C=None), a histogram is computed: each (x, y) point in a bin has a value of 1, and the reduction function is sum (i.e. a count, since every value is 1).

If you specify C (another column of df, for pandas), then each point is a 3-tuple (x, y, c): (x, y) still determines which bin the point ends up in, and the c values of the points in a bin are what gets passed to the reduction function, reduce_C_function.
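To make the two cases concrete, here is a rough pure-Python sketch of the bin-then-reduce idea. It is not how matplotlib implements hexbin: binned_reduce is a hypothetical helper, it uses square cells on the unit square instead of hexagons, and the sample data is made up for illustration.

```python
from collections import defaultdict


def binned_reduce(xs, ys, values=None, reduce_func=sum, gridsize=2):
    """Group (x, y) points in the unit square into square cells, then reduce.

    Hypothetical helper for illustration only; matplotlib's hexbin uses
    hexagonal cells, square cells just keep the sketch short.
    """
    if values is None:
        # Default case: every point contributes 1, so summing gives a count.
        values = [1] * len(xs)
    cells = defaultdict(list)
    for x, y, v in zip(xs, ys, values):
        # (x, y) alone decide the cell; v is only collected for the reduction.
        col = min(int(x * gridsize), gridsize - 1)
        row = min(int(y * gridsize), gridsize - 1)
        cells[(col, row)].append(v)
    # Each cell's collected values are handed to the reduction function.
    return {cell: reduce_func(vs) for cell, vs in cells.items()}


xs = [0.1, 0.2, 0.7, 0.8, 0.9]
ys = [0.1, 0.3, 0.6, 0.9, 0.7]
c = [5.0, 2.0, 1.0, 9.0, 4.0]

counts = binned_reduce(xs, ys)                             # histogram
maxima = binned_reduce(xs, ys, values=c, reduce_func=max)  # max of c per cell
```

With no values given, the result is a per-cell count; with values and max, each cell holds the largest c that fell into it.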

What's the policy on respecting other libraries' function arguments? Other pandas plotting functions seem to use colormap instead of matplotlib's cmap.

If we're willing to change, I'd suggest renaming C to value and reduce_C_function to reduce or reduce_func.
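For what it's worth, a rename like that could be a thin translation layer in front of matplotlib. The sketch below is only an assumption about how it might look: to_matplotlib_kwargs and the mapping table are hypothetical names, not anything that exists in pandas, and reduce_func is picked arbitrarily over reduce.

```python
# Hypothetical mapping from the proposed pandas-style names to matplotlib's.
PROPOSED_TO_MATPLOTLIB = {
    "value": "C",
    "reduce_func": "reduce_C_function",
    "colormap": "cmap",
}


def to_matplotlib_kwargs(**kwargs):
    """Rename proposed keywords to matplotlib's names; pass others through."""
    translated = {}
    for name, arg in kwargs.items():
        target = PROPOSED_TO_MATPLOTLIB.get(name, name)
        if target in translated:
            # e.g. both value=... and C=... were supplied
            raise TypeError(f"got multiple values for {target!r}")
        translated[target] = arg
    return translated


# "C" here is the column name from the example DataFrame.
mpl_kwargs = to_matplotlib_kwargs(value="C", reduce_func=max, gridsize=10)
```

Unknown keywords such as gridsize pass through untouched, so the matplotlib names would keep working alongside the new ones.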

Here's an example to play with if that helps:

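import numpy as np
from pandas import DataFrame

# Two random coordinate columns plus a noisy, increasing value column
df = DataFrame({"A": np.random.uniform(size=20),
                "B": np.random.uniform(size=20),
                "C": np.arange(20) + np.random.uniform(size=20)})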

ax = df.plot(kind='hexbin', x='A', y='B', gridsize=10)  # histogram by default
ax = df.plot(kind='hexbin', x='A', y='B', C='C', reduce_C_function=np.max)

So in the second case the color of a bin will be determined by the maximum value of C (from column C in df) in that bin.