PERF: DataFrame groupby with fast transform · Issue #12737 · pandas-dev/pandas

@jreback

from SO

import pandas as pd
import numpy as np

df = pd.DataFrame({'group': np.repeat(np.arange(1000), 10),
                   'B': np.nan,
                   'C': np.nan})

# .ix has been removed from pandas; .loc is equivalent here (default integer index)
df.loc[4::10, 'B':'C'] = 5  # the row at position 4 of each group is non-null

df.groupby('group').transform('first')

This ends up iterating over the groups. The last change to this code path that I can find is: here. My recollection is that the iteration was ONLY supposed to be hit in a special case, and that the general case is simply a repeat of the aggregated result based on the group indices.

The iterating path now seems to be hit in all cases, which makes transform super SLOW again.
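For illustration, the "repeat based on the indices" idea can be sketched in user-level code: aggregate once per group, then broadcast the per-group rows back to the original shape using integer group codes. This is only a rough sketch of the idea, not the actual pandas internals; the helper name `fast_transform_first` is hypothetical, and it assumes the default `sort=True` groupby and no missing values in the key column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'group': np.repeat(np.arange(1000), 10),
                   'B': np.nan,
                   'C': np.nan})
df.loc[4::10, 'B':'C'] = 5


def fast_transform_first(frame, key):
    """Hypothetical helper: aggregate once, then repeat rows by group code."""
    # One row per group; with the default sort=True the index is the sorted keys.
    agg = frame.groupby(key).first()
    # Integer code of each original row's group, in the same sorted order.
    codes, _ = pd.factorize(frame[key], sort=True)
    # Broadcast: pick the aggregated row for every original row.
    out = agg.iloc[codes]
    out.index = frame.index
    return out


result = fast_transform_first(df, 'group')
expected = df.groupby('group').transform('first')

# Both approaches should agree on this example.
assert np.array_equal(result.to_numpy(), expected.to_numpy())
```

The point of the sketch is that no Python-level loop over the groups is needed: the cost is one aggregation plus one positional take.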