PERF: Very slow printing of Series with DatetimeIndex · Issue #19764 · pandas-dev/pandas
Consider this series with a datetime index:
import numpy as np
import pandas as pd
s = pd.Series(np.random.randn(int(1e6)),
              index=pd.date_range("2012-01-01", freq='s', periods=int(1e6)))
Displaying this in the IPython console or notebook (by evaluating just s) is very slow: a noticeable delay of 1 to 2 seconds in the console, and around 8 seconds in the notebook or JupyterLab. In a plain Python console, on the other hand, the display is instant.
Furthermore, if the series has no DatetimeIndex, the display is instant. If it is a frame (s.to_frame()), the display is also instant. When calling print(s) explicitly, the output is identical and also instant. With another datetime-like index, e.g. s.to_period(), the display is instant as well.
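To put numbers on the comparisons above, here is a minimal timing sketch. The helper names (`build_series`, `time_call`, `compare_displays`) are made up for illustration and are not part of pandas. It only measures the plain `repr`/`str` path, so it should give the "fast" baseline; if evaluating `s` in IPython takes far longer than these numbers, the extra time is presumably spent outside the text rendering itself.

```python
import time

import numpy as np
import pandas as pd


def build_series(n):
    """Series of n random values on a second-frequency DatetimeIndex."""
    index = pd.date_range("2012-01-01", freq="s", periods=n)
    return pd.Series(np.random.randn(n), index=index)


def time_call(fn):
    """Return the wall-clock seconds taken by a single call to fn()."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def compare_displays(s):
    """Time the text rendering of each variant mentioned in the report."""
    # Build the variants first so that only the rendering is timed,
    # not the conversion itself.
    objects = {
        "Series, DatetimeIndex": s,
        "Series, default index": s.reset_index(drop=True),
        "DataFrame (to_frame)": s.to_frame(),
        "Series, PeriodIndex": s.to_period(),
    }
    timings = {
        label: time_call(lambda obj=obj: repr(obj))
        for label, obj in objects.items()
    }
    # str(s) is the text that print(s) writes out.
    timings["str(s), as used by print"] = time_call(lambda: str(s))
    return timings


timings = compare_displays(build_series(int(1e6)))
for label, seconds in timings.items():
    print(f"{label:<28} {seconds:.4f} s")
```

Running the same script in a plain Python console and via `%run` in IPython gives the baseline; the slow case is specifically the implicit display of `s` as the last expression of a cell or prompt, which this script does not exercise.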
So it seems we are doing some work under the hood that is not needed. And it is somehow triggered by doing this in an IPython context, and specifically for DatetimeIndex.
This already seems to be present in some older versions of pandas as well.
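One way to look for the suspected hidden work is to profile the display call. The sketch below uses the standard library's `cProfile`; `profile_call` is a hypothetical helper, not a pandas or IPython function. It profiles plain `repr` here, which is the fast path; inside an IPython session one could pass `IPython.display.display` instead (assuming that goes through the same formatter machinery as evaluating `s`) and compare the two reports.

```python
import cProfile
import io
import pstats

import numpy as np
import pandas as pd


def profile_call(fn, *args, top=10):
    """Run fn(*args) under cProfile and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        fn(*args)
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


n = int(1e6)
s = pd.Series(np.random.randn(n),
              index=pd.date_range("2012-01-01", freq="s", periods=n))

# Baseline: the plain repr path, which the report describes as instant.
report = profile_call(repr, s)
print(report)
```

Any function that shows up near the top of the IPython report but not in this baseline is a candidate for the unnecessary work.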