Performance decrease of groupby.first for datetime64 in 0.14 · Issue #7555 · pandas-dev/pandas
Description
I have seen a dramatic (around 100x) slowdown of the groupby.first method for the datetime64 type after updating to 0.14.
Here is an example:
    from time import time

    import pandas as pd

    # 100000 rows, one group per row; column 'a' is datetime64
    d = pd.DataFrame({'a': pd.date_range('1/1/2011', periods=100000, freq='s'),
                      'b': range(100000)})
    t = time()
    d.groupby('b').first()
    print(time() - t)
Result:
The same test with an int type column:
    from time import time

    import pandas as pd

    # Same shape as above, but column 'a' is int
    d = pd.DataFrame({'a': range(100000), 'b': range(100000)})
    t = time()
    d.groupby('b').first()
    print(time() - t)
Result:
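A single time() measurement can be noisy, so a steadier way to compare the two dtypes might look like the sketch below. It is not from the original report: the helper name best_groupby_first_time and the choice of three repeats are assumptions made for illustration, and the timings will depend on the pandas version and machine.

```python
import timeit

import pandas as pd


def best_groupby_first_time(values, n_groups, repeat=3):
    """Return the best wall-clock time in seconds of groupby('b').first().

    Hypothetical helper for illustration; not part of pandas.
    """
    df = pd.DataFrame({'a': values, 'b': range(n_groups)})
    # Take the minimum over several runs to reduce noise from one measurement.
    return min(timeit.repeat(lambda: df.groupby('b').first(),
                             number=1, repeat=repeat))


n = 100000
t_datetime = best_groupby_first_time(
    pd.date_range('1/1/2011', periods=n, freq='s'), n)
t_int = best_groupby_first_time(range(n), n)

print('datetime64: %.4f s' % t_datetime)
print('int:        %.4f s' % t_int)
print('ratio:      %.1fx' % (t_datetime / t_int))
```

Running this under both the older and the newer pandas version should show whether the datetime64/int ratio changed between releases.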