PERF: significant speedups in tz-aware operations by qwhelan · Pull Request #24491 · pandas-dev/pandas

Operations involving tz-aware data currently incur a pretty substantial penalty:

[ 93.04%] ··· timeseries.DatetimeAccessor.time_dt_accessor_year                                                                                                           ok
[ 93.04%] ··· ============ =============
                   t
              ------------ -------------
                  None      2.41±0.07ms
             **US/Eastern     150±2ms**
                  UTC       2.64±0.07ms
                tzutc()     2.70±0.06ms
              ============ =============

[ 93.19%] ··· timeseries.DatetimeIndex.time_add_timedelta                                                                                                                 ok
[ 93.19%] ··· ============ ============
               index_type
              ------------ ------------
                  dst          n/a
                repeated       n/a
              **tz_aware     305±7ms**
                tz_naive    3.38±0.2ms
              ============ ============

This PR brings the performance of tz-aware operations close to that of tz-naive ones through three changes:

1. Eliminate a duplicative validation check by setting _freq directly.
   - The freq setter runs a validation check that compares the input value against to_offset(self.inferred_freq), which is exactly the value we are passing.
   - For tz-aware data, .inferred_freq requires converting the entire array to the appropriate tz, and the time to do so dominates the runtime of our benchmark. Simply eliminating one of the two calls cuts runtime by 50%.
2. Improve the performance of pandas._libs.tslibs.conversion functions by batching searchsorted calls rather than making them piecewise.
   - In theory, both approaches should be O(n log(k)). However, per-call overhead appears to be substantially larger than the actual search time at the n used by our existing benchmark.
   - The existing comment suggests that operating piecewise is beneficial in the presence of many iNaT values, but this ignores the fact that searchsorted has optimizations for when the needle array is also sorted (which allow it to incrementally shrink the search region).
3. Minor speedup from minimizing is_tzlocal() calls.
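
To illustrate the searchsorted batching point, here is a minimal NumPy sketch. It is not the Cython code touched by this PR: the function names and the `transitions` and `values` arrays are hypothetical stand-ins for a timezone's sorted transition times and an array of i8 timestamps. Both functions return the same indices; the batched version simply pays the Python-level call overhead once instead of once per element.

```python
import numpy as np


def lookup_piecewise(values, transitions):
    """One searchsorted call per element (illustrative 'piecewise' pattern)."""
    out = np.empty(len(values), dtype=np.intp)
    for i, v in enumerate(values):
        # Index of the last transition that is <= v
        out[i] = transitions.searchsorted(v, side="right") - 1
    return out


def lookup_batched(values, transitions):
    """A single vectorized searchsorted call over the whole array."""
    return transitions.searchsorted(values, side="right") - 1


# Hypothetical data: ten sorted "transition" points and sorted "timestamps"
transitions = np.arange(0, 1000, 100, dtype=np.int64)
values = np.arange(5, 995, 7, dtype=np.int64)

# Both approaches agree on every element
assert (lookup_piecewise(values, transitions)
        == lookup_batched(values, transitions)).all()
```

Timing the two functions on a large `values` array (for example with `timeit`) is one way to check how much of the cost is call overhead rather than search time on a given machine.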

Here's the same comparison with the PR:

[ 26.38%] ··· timeseries.DatetimeAccessor.time_dt_accessor_year                                                                                                           ok
[ 26.38%] ··· ============ =============
                   t
              ------------ -------------
                  None      2.31±0.06ms
               US/Eastern   3.70±0.08ms
                  UTC       2.49±0.03ms
                tzutc()      2.43±0.1ms
              ============ =============

[ 26.52%] ··· timeseries.DatetimeIndex.time_add_timedelta                                                                                                                 ok
[ 26.52%] ··· ============ =============
               index_type
              ------------ -------------
                  dst           n/a
                repeated        n/a
                tz_aware     4.01±0.2ms
                tz_naive    2.33±0.04ms
              ============ =============

And asv output:

$ asv compare upstream/master HEAD -s --sort ratio --only-changed
       before           after         ratio
     [02a97c0a]       [5a5ed18b]
     <tz_aware_op_speedup~2>       <tz_aware_op_speedup>
-      3.38±0.2ms      2.33±0.04ms     0.69  timeseries.DatetimeIndex.time_add_timedelta('tz_naive')
-      21.0±0.2ms       14.4±0.3ms     0.68  inference.DateInferOps.time_timedelta_plus_datetime
-        700±80ns         366±20ns     0.52  timestamp.TimestampProperties.time_dayofweek(<UTC>, 'B')
-         181±2ms       30.6±0.7ms     0.17  timeseries.DatetimeAccessor.time_dt_accessor_time('US/Eastern')
-        180±10ms       29.1±0.9ms     0.16  timeseries.DatetimeAccessor.time_dt_accessor_date('US/Eastern')
-         178±3ms       26.7±0.6ms     0.15  timeseries.DatetimeAccessor.time_dt_accessor_day_name('US/Eastern')
-         179±2ms       26.3±0.4ms     0.15  timeseries.DatetimeIndex.time_to_time('tz_aware')
-         172±4ms       24.8±0.7ms     0.14  timeseries.DatetimeAccessor.time_dt_accessor_month_name('US/Eastern')
-         175±5ms       24.1±0.9ms     0.14  timeseries.DatetimeIndex.time_to_date('tz_aware')
-         150±2ms      3.70±0.08ms     0.02  timeseries.DatetimeAccessor.time_dt_accessor_year('US/Eastern')
-        95.3±5ms      2.02±0.07ms     0.02  indexing.NonNumericSeriesIndexing.time_getitem_label_slice('datetime', 'nonunique_monotonic_inc')
-         149±2ms      2.85±0.02ms     0.02  timeseries.DatetimeIndex.time_timeseries_is_month_start('tz_aware')
-         305±7ms       4.01±0.2ms     0.01  timeseries.DatetimeIndex.time_add_timedelta('tz_aware')
-        95.3±3ms          356±9μs     0.00  indexing.NonNumericSeriesIndexing.time_get_value('datetime', 'nonunique_monotonic_inc')
       before           after         ratio
     [02a97c0a]       [5a5ed18b]
     <tz_aware_op_speedup~2>       <tz_aware_op_speedup>
+         323±6ms          386±9ms     1.19  timeseries.ToDatetimeISO8601.time_iso8601_tz_spaceformat
+         165±2ms          191±4ms     1.15  timeseries.ToDatetimeCache.time_dup_string_tzoffset_dates(False)
+         151±5μs          170±2μs     1.13  indexing.NonNumericSeriesIndexing.time_getitem_scalar('datetime', 'nonunique_monotonic_inc')
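
For reading the table above: the ratio column in asv compare is the "after" time divided by the "before" time, so values below 1 are speedups. The short sketch below recomputes a few ratios from the displayed timings. The helper name `asv_ratio` is hypothetical, and because the printed timings are rounded, recomputed ratios can differ slightly from the ones asv reports.

```python
def asv_ratio(before, after):
    """Ratio as reported by asv compare: after / before (same time unit)."""
    return after / before


# (before_ms, after_ms) copied from the asv output above
rows = {
    "time_add_timedelta('tz_aware')": (305.0, 4.01),
    "time_dt_accessor_year('US/Eastern')": (150.0, 3.70),
    "time_add_timedelta('tz_naive')": (3.38, 2.33),
}

for name, (before, after) in rows.items():
    ratio = asv_ratio(before, after)
    speedup = before / after  # reciprocal of the ratio
    print(f"{name}: ratio={ratio:.2f}, speedup={speedup:.1f}x")
```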

- closes #xxxx
- tests added / passed
- passes git diff upstream/master -u -- "*.py" | flake8 --diff
- whatsnew entry