PERF: enable caching on expensive offsets

xref #16463

We have caching logic for generating ranges of offsets that appears to be unused but still functional. Example:

In [63]: BM = pd.offsets.BusinessMonthEnd()

In [64]: %timeit pd.date_range('1990-01-01', '2020-01-01', freq=BM)
18.1 ms ± 366 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [65]: BM._cacheable
Out[65]: False

In [66]: BM._cacheable = True

In [67]: %timeit pd.date_range('1990-01-01', '2020-01-01', freq=BM)
686 µs ± 17.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [68]: a = pd.date_range('1990-01-01', '2020-01-01', freq='BM')

In [69]: b = pd.date_range('1990-01-01', '2020-01-01', freq=BM)

In [70]: (a == b).all()
Out[70]: True

Should we enable this by default, or expose an API for it? It would matter most for the slower offsets.

cc @jbrockmendel