Performance decrease of groupby.first for datetime64 in 0.14
I have seen a dramatic (around 100x) slowdown of the groupby.first method on datetime64 columns after updating to pandas 0.14.
Here is an example:
from time import time
import pandas as pd
# 100,000 rows: a datetime64 column 'a' and a unique integer key 'b'
d = pd.DataFrame({'a': pd.date_range('1/1/2011', periods=100000, freq='s'), 'b': range(100000)})
t = time()
d.groupby('b').first()
print(time() - t)
Result:
3.87164497375
The same test with an int column in place of the datetime64 column:
from time import time
import pandas as pd
# Same shape as above, but column 'a' holds plain integers
d = pd.DataFrame({'a': range(100000), 'b': range(100000)})
t = time()
d.groupby('b').first()
print(time() - t)
Result:
0.0148799419403
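For anyone trying to reproduce this, the sketch below runs both cases in a single script and reports the best of several runs, which should be less noisy than a single time() difference. The helper names time_groupby_first and make_frames are my own and not part of pandas; the frames are built the same way as in the two snippets above.

```python
import timeit

import pandas as pd


def time_groupby_first(frame, repeat=3):
    """Return the best wall-clock time in seconds of frame.groupby('b').first()."""
    # Hypothetical helper, not a pandas API: each repeat runs the call once.
    timer = timeit.Timer(lambda: frame.groupby('b').first())
    return min(timer.repeat(repeat=repeat, number=1))


def make_frames(n=100000):
    """Build the datetime64 and int frames from the report, with n rows each."""
    dt_frame = pd.DataFrame({'a': pd.date_range('1/1/2011', periods=n, freq='s'),
                             'b': range(n)})
    int_frame = pd.DataFrame({'a': range(n), 'b': range(n)})
    return dt_frame, int_frame


if __name__ == '__main__':
    dt_frame, int_frame = make_frames()
    # Print the version so timings from different installs can be compared.
    print('pandas', pd.__version__)
    print('datetime64:', time_groupby_first(dt_frame))
    print('int:       ', time_groupby_first(int_frame))
```

Running the same script under 0.13 and 0.14 should show whether the gap between the two dtypes is specific to the new version.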