Shortcut functions in transform are not grouped · Issue #19354 · pandas-dev/pandas
$ ipython3
Python 3.6.3 (default, Oct 3 2017, 21:45:48)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'A': [1, 1, 2], 'B': [1, 2, 3]})
In [3]: df.groupby('A').transform('rank')
Out[3]:
   B
0  1
1  1
2  2
In [4]: df.groupby('A').transform(lambda x: x.rank())
Out[4]:
     B
0  1.0
1  2.0
2  1.0
Problem description
Functions passed to transform as shorthand strings, such as 'rank', do not seem to respect the grouping of the data. One would expect Out[3] and Out[4] in the code above to be identical. Instead, .transform('rank') appears to ignore the groups and rank the values independently of them. I have reproduced this with larger DataFrames as well.
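A minimal sketch of a cross-check and a possible workaround, using the same frame as above. It assumes that calling rank() directly on the GroupBy object ranks within each group; the variable names (expected, direct, shorthand) are only illustrative, and whether the string shorthand agrees with the other two depends on the installed pandas version.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 2], 'B': [1, 2, 3]})
grouped = df.groupby('A')

# Per-group ranking through an explicit callable (the Out[4] behaviour).
expected = grouped.transform(lambda x: x.rank())

# Possible workaround: call rank() on the GroupBy object directly.
# The result is aligned with the original index.
direct = grouped.rank()

# The string shorthand from the report; compare it rather than assume a result.
shorthand = grouped.transform('rank')
shorthand_matches = bool((shorthand['B'] == expected['B']).all())

print(expected['B'].tolist())
print(direct['B'].tolist())
print('shorthand matches per-group ranks:', shorthand_matches)
```

If the last line prints False, the shorthand is affected on that version, and either the lambda form or the direct rank() call can be used instead.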