PERF: Timestamp/Timedelta constructors when passed a Timestamp/Timedelta · Issue #30543 · pandas-dev/pandas
The Timestamp constructor could be made faster for the case where an existing Timestamp object is passed to it, by short-circuiting with an isinstance-like check:
In [1]: import pandas as pd; pd.__version__
Out[1]: '0.26.0.dev0+1469.ge817ffff3'

In [2]: ts = pd.Timestamp('2020')

In [3]: def timestamp_isinstance_shortcircuit(ts):
   ...:     if isinstance(ts, pd.Timestamp):
   ...:         return ts
   ...:     return pd.Timestamp(ts)
   ...:

In [4]: %timeit pd.Timestamp(ts)
849 ns ± 13.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [5]: %timeit timestamp_isinstance_shortcircuit(ts)
121 ns ± 0.279 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
Some care is needed in the constructor to check whether other arguments have been passed, e.g. tz, in which case we wouldn't be able to return the Timestamp object directly.
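To make the "other arguments" caveat concrete, here is a minimal pure-Python sketch of a constructor-level fast path. The class `Instant` and its `value`/`tz` attributes are hypothetical stand-ins for illustration only; this is not pandas code and does not reflect how the real constructor is written.

```python
class Instant:
    """Hypothetical immutable value type standing in for Timestamp."""

    __slots__ = ("value", "tz")

    def __new__(cls, value, tz=None):
        # Fast path: the input is already an Instant and no other
        # argument was supplied, so the same object can be returned.
        if tz is None and isinstance(value, cls):
            return value

        # Slow path: build a new object, because either the input needs
        # converting or an extra argument (tz) changes the result.
        self = super().__new__(cls)
        if isinstance(value, Instant):
            self.value = value.value
        else:
            self.value = int(value)
        self.tz = tz
        return self


a = Instant(5)
same = Instant(a)            # fast path, no new object is created
other = Instant(a, tz="UTC") # tz was passed, so a new object is built
print(same is a, other is a)
```

Returning the input unchanged is only reasonable here because the object is treated as immutable; a mutable type would need a copy instead.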
Similar story for the Timedelta constructor (should be done in a separate PR):
In [6]: td = pd.Timedelta('1 day')

In [7]: def timedelta_isinstance_shortcircuit(td):
   ...:     if isinstance(td, pd.Timedelta):
   ...:         return td
   ...:     return pd.Timedelta(td)
   ...:

In [8]: %timeit pd.Timedelta(td)
800 ns ± 1.35 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [9]: %timeit timedelta_isinstance_shortcircuit(td)
120 ns ± 0.269 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
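For anyone wanting to reproduce the comparison outside IPython, the following is a rough sketch using the standard-library `timeit` module. The helper name `shortcircuit` is made up for this example, and the guard on keyword arguments is an assumption about when returning the input unchanged is safe. Absolute timings will vary by machine and pandas version, so none are shown.

```python
import timeit

import pandas as pd


def shortcircuit(cls, value, **kwargs):
    """Return `value` unchanged if it is already a `cls` and no extra
    keyword arguments were given; otherwise defer to the constructor."""
    if not kwargs and isinstance(value, cls):
        return value
    return cls(value, **kwargs)


ts = pd.Timestamp("2020")
td = pd.Timedelta("1 day")

for cls, obj in [(pd.Timestamp, ts), (pd.Timedelta, td)]:
    # Time the plain constructor call against the guarded wrapper.
    direct = timeit.timeit(lambda: cls(obj), number=100_000)
    guarded = timeit.timeit(lambda: shortcircuit(cls, obj), number=100_000)
    print(f"{cls.__name__}: constructor {direct:.4f}s, short-circuit {guarded:.4f}s")
```

Passing any keyword, such as `tz="UTC"`, skips the fast path and falls through to the regular constructor.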
xref #30520