Make pivot values work with most iterables by cscanlin · Pull Request #12017 · pandas-dev/pandas
Right now, the only objects that can be passed as values to a pivot table are lists and tuples. I would like to expand this so that other iterable objects, such as generators and (in Python 3) the views returned by dictionary methods like .keys(), can be passed in as well. My use case is something like this:
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': ['foo', 'foo', 'foo', 'foo',
                         'bar', 'bar', 'bar', 'bar',
                         'foo', 'foo', 'foo'],
                   'B': np.random.randn(11),
                   'C': np.random.randn(11)})
print(df)
aggs = {'B': np.sum, 'C': np.mean}
pivot_output = pd.pivot_table(
    df,
    index=['A'],
    values=aggs.keys(),
    aggfunc=aggs,
)
print(pivot_output)
This works in Python 2 but raises an error in Python 3, because .keys() returns a dict_keys view object instead of a list of keys. This forces you to explicitly convert it to a list before passing it in, which feels clunky: values=list(aggs.keys()),
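To illustrate why the Python 3 view trips a strict type check, here is a minimal sketch in plain Python. The helper `is_list_or_tuple` is a hypothetical stand-in for a check that accepts only lists and tuples; it is not copied from the pandas source.

```python
# Same shape as the aggregation mapping in the example above
# (string names used here only to keep the sketch dependency-free).
aggs = {"B": "sum", "C": "mean"}
keys_view = aggs.keys()


def is_list_or_tuple(obj):
    """Hypothetical stand-in for a check that accepts only lists and tuples."""
    return isinstance(obj, (list, tuple))


print(type(keys_view).__name__)             # dict_keys
print(is_list_or_tuple(keys_view))          # False: the view is not a list
print(is_list_or_tuple(list(keys_view)))    # True: the explicit conversion passes
```

The view is still iterable and yields the keys in insertion order, so only the type check is standing in the way.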
@MaximilianR pointed out that using is_list_like in the source is probably preferable to the conditional I wrote. I am also wondering whether it's necessary to explicitly convert other iterable objects to a list, or whether they would work appropriately as is. I need to test more.
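A rough sketch of what the is_list_like approach could look like, assuming the public `pandas.api.types.is_list_like` function is available. The helper name `normalize_values` is hypothetical and only illustrates the idea; it is not the actual change in this PR.

```python
from pandas.api.types import is_list_like


def normalize_values(values):
    """Hypothetical helper: return `values` as a list of column labels.

    List-like inputs (lists, tuples, dict views, generators) are materialized
    into a list; a scalar label such as a single string is wrapped instead.
    """
    if is_list_like(values):
        # Converting up front means one-shot iterables such as generators
        # are consumed exactly once and can be reused afterwards.
        return list(values)
    return [values]


aggs = {"B": "sum", "C": "mean"}
print(normalize_values(aggs.keys()))             # ['B', 'C']
print(normalize_values(k for k in ("B", "C")))   # ['B', 'C']
print(normalize_values("B"))                     # ['B']
```

Converting to a list explicitly is the conservative choice here: a generator can only be iterated once, so code that loops over the values more than once would otherwise see it empty the second time.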
As far as unit tests go, I'm still learning and I'm not sure what the best way to write them for this change would be, but I'll continue looking into it.
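One possible shape for such a test, offered as a sketch rather than a finished test for the pandas suite: build the same pivot table once with a plain list and once with the dict_keys view, then check that the two results match. The test name is hypothetical, deterministic data replaces the random columns, and string aggregation names are used in place of np.sum and np.mean.

```python
import numpy as np
import pandas as pd


def test_pivot_table_values_accepts_dict_keys():
    # Deterministic data so the test does not depend on random draws.
    df = pd.DataFrame({
        "A": ["foo"] * 4 + ["bar"] * 4 + ["foo"] * 3,
        "B": np.arange(11, dtype=float),
        "C": np.arange(11, dtype=float) * 2,
    })
    aggs = {"B": "sum", "C": "mean"}

    # Baseline: the explicit list conversion that already works.
    expected = pd.pivot_table(
        df, index=["A"], values=list(aggs.keys()), aggfunc=aggs
    )
    # Under test: passing the dict_keys view directly.
    result = pd.pivot_table(
        df, index=["A"], values=aggs.keys(), aggfunc=aggs
    )

    pd.testing.assert_frame_equal(result, expected)
```

The same pattern could be repeated for a generator or any other iterable the change is meant to support.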
I may have jumped the gun on this PR a bit, so please let me know if there's a better way I should proceed.