PERF: using murmur hash for float64 khash-tables by realead · Pull Request #36729 · pandas-dev/pandas
Now the comparison of different probing strategies:
- the base: double hashing (the state of this PR)
- linear probing (realead/pandas@1dc7af7)
- quadratic probing (i.e. khash 0.2.8) (realead/pandas@59600e5)
- combined linear probing + double hashing (realead/pandas@5e44098)
The choice of probing strategy matters mostly where comparisons are cheap, so that cache misses play a role, e.g. float64/int64/uint64. For heavier types (PyObject, strings) it only plays a role when there are robustness problems, i.e. when look-ups cost more than O(n) in total.
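To make the difference between the first three strategies concrete, here is a minimal sketch of the probe sequences they generate in a power-of-two table. It is an illustration, not the code from this PR or from khash: `second_step` is a hypothetical stand-in for the second hash function, and the quadratic variant uses triangular increments (1, 2, 3, ...), which is one common way to do quadratic probing on a power-of-two table.

```python
def second_step(h, mask):
    # Hypothetical second hash. It is forced odd so that the step is coprime
    # with the power-of-two table size and the sequence visits every bucket.
    return (((h >> 3) ^ (h << 3)) | 1) & mask


def probe_sequence(h, n_buckets, strategy, n_probes):
    """Return the first n_probes buckets visited for hash value h."""
    mask = n_buckets - 1  # n_buckets is assumed to be a power of two
    i = h & mask
    seq = [i]
    for k in range(1, n_probes):
        if strategy == "linear":
            step = 1                     # always the neighbouring bucket
        elif strategy == "quadratic":
            step = k                     # increments 1, 2, 3, ...
        elif strategy == "double":
            step = second_step(h, mask)  # constant, key-dependent jump
        else:
            raise ValueError(strategy)
        i = (i + step) & mask
        seq.append(i)
    return seq


for name in ("linear", "quadratic", "double"):
    print(name, probe_sequence(5, 16, name, 4))
```

For `h = 5` and 16 buckets the first four buckets are 5, 6, 7, 8 (linear), 5, 6, 8, 11 (quadratic) and 5, 14, 7, 0 (double hashing): the first two stay close to the home bucket, which is cache-friendly, while double hashing jumps away immediately.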
Linear probing:
Linear probing performs really badly, at least for some series, for types whose first (and only) hash function is weak. This is what we see (the whole comparison is at the end of this comment):
- the hash for PyObject is really bad: some test cases become more than 100 times slower
- the hash for int/uint is better, but there is still a "bad" series for which it becomes slower by a factor of 20 to 25
- the murmur2 hash used for float64 is quite strong: there are no problematic series in the test suite.
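The effect of a weak first hash on linear probing can be reproduced with a toy open-addressing table. The sketch below is an assumption-laden illustration, not the pandas/khash implementation: the first hash is the identity (as a stand-in for a weak integer hash), `second_hash_step` is a hypothetical multiplicative mixer, and the "bad series" is simply a contiguous range of keys queried with values that share the same home buckets but are not in the table.

```python
MASK64 = (1 << 64) - 1


def second_hash_step(key, mask):
    # Hypothetical second hash: multiplicative mixing, forced odd so that the
    # step is coprime with the power-of-two table size.
    mixed = (key * 0x9E3779B97F4A7C15) & MASK64
    return ((mixed >> 32) | 1) & mask


def linear_next(i, key, mask):
    return (i + 1) & mask


def double_next(i, key, mask):
    return (i + second_hash_step(key, mask)) & mask


def find_slot(table, key, next_slot):
    """Return (slot, inspected buckets) for key; first hash is the identity."""
    mask = len(table) - 1
    i = key & mask
    inspected = 1
    while table[i] is not None and table[i] != key:
        i = next_slot(i, key, mask)
        inspected += 1
    return i, inspected


def total_lookup_cost(keys, queries, n_buckets, next_slot):
    """Insert keys, then count the buckets inspected while looking up queries."""
    table = [None] * n_buckets
    for key in keys:
        slot, _ = find_slot(table, key, next_slot)
        table[slot] = key
    return sum(find_slot(table, q, next_slot)[1] for q in queries)


keys = range(500)             # fills buckets 0..499 as one contiguous run
queries = range(1024, 1524)   # same home buckets, but none is in the table
linear_cost = total_lookup_cost(keys, queries, 1024, linear_next)
double_cost = total_lookup_cost(keys, queries, 1024, double_next)
print(linear_cost, double_cost)
```

With linear probing every miss has to walk to the end of the run, so the total cost grows quadratically with the run length; with a second hash the probes leave the run after a few steps.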
Quadratic probing:
Quadratic probing is supposed to be better than linear probing because it is more robust. This is also what we observe (all results at the end):
- PyObject is only roughly 20 to 30 times slower in the worst cases
- problematic series for int64/uint64 are only 2 times slower
- otherwise gains are similar to linear
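Why quadratic probing degrades more gracefully can be seen by counting how many buckets must be inspected to get past a contiguous run of occupied buckets. This is a minimal sketch under simplifying assumptions (a single run starting at bucket 0, a probe that starts at the beginning of the run, and triangular increments as the quadratic scheme); it is not taken from the PR.

```python
def probes_to_leave_run(run_length, n_buckets, strategy):
    """Count buckets inspected until the first free bucket is reached.

    Buckets 0..run_length-1 are occupied and the probe starts at bucket 0.
    n_buckets is assumed to be a power of two larger than run_length.
    """
    mask = n_buckets - 1
    occupied = [slot < run_length for slot in range(n_buckets)]
    i, k, inspected = 0, 0, 1
    while occupied[i]:
        k += 1
        step = 1 if strategy == "linear" else k  # quadratic: 1, 2, 3, ...
        i = (i + step) & mask
        inspected += 1
    return inspected


print(probes_to_leave_run(500, 1024, "linear"))     # 501
print(probes_to_leave_run(500, 1024, "quadratic"))  # 33
```

The linear cost is proportional to the run length, the quadratic cost only to its square root, which is consistent with the slow-downs being smaller but still present.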
Combined probing:
Theoretically this is the best of both worlds: as robust as double hashing, but with as few cache misses as quadratic probing. The results look as expected: there are some cases with about 10% slow-down but many more cases with speed-ups, comparable to the speed-ups from quadratic probing. Nothing bad happens for weak hashes either. The way it is implemented, combined probing can be switched on/off per type (it is only on for float64/(u)int64; in tests for PyObject no noteworthy changes were seen with combined probing).
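One way to picture combined probing with a per-type switch is sketched below. The details are assumptions made for illustration and may differ from the actual commit: the scheme shown takes a fixed number of linear steps (`LINEAR_STEPS`, hypothetical) before falling back to the double-hashing step, and `USE_COMBINED` is a hypothetical table mirroring the statement that the feature is on only for float64/(u)int64.

```python
LINEAR_STEPS = 3  # hypothetical: number of cheap neighbouring probes

# Hypothetical per-type switch: on for float64/(u)int64, off for PyObject.
USE_COMBINED = {"float64": True, "int64": True, "uint64": True, "object": False}


def next_slot(i, probe_no, step, mask, dtype):
    """Next bucket after i; step is the (odd) double-hashing step."""
    if USE_COMBINED.get(dtype, False) and probe_no <= LINEAR_STEPS:
        return (i + 1) & mask   # stay close to the home bucket first
    return (i + step) & mask    # then jump like plain double hashing


def probe_sequence(home, step, n_buckets, dtype, n_probes):
    mask = n_buckets - 1  # n_buckets is assumed to be a power of two
    seq = [home & mask]
    for probe_no in range(1, n_probes):
        seq.append(next_slot(seq[-1], probe_no, step, mask, dtype))
    return seq


print(probe_sequence(5, 9, 16, "int64", 6))   # [5, 6, 7, 8, 1, 10]
print(probe_sequence(5, 9, 16, "object", 6))  # [5, 14, 7, 0, 9, 2]
```

Because the fallback step is odd and the table size is a power of two, the sequence still reaches every bucket after the linear prefix, so the robustness of double hashing is kept.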
Here are all the comparisons for combined probing:
before after ratio
[c852b2ee] [5e44098d]
+ 23.0±0.07ms 27.9±0.9ms 1.21 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 1000, 2)
+ 1.93±0.2μs 2.29±0.2μs 1.18 index_cached_properties.IndexCache.time_inferred_type('DatetimeIndex')
+ 24.7±0.3ms 28.5±0.7ms 1.15 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 1000, 2)
+ 23.7±0.1ms 27.1±0.5ms 1.15 hash_functions.IsinWithArange.time_isin(<class 'numpy.float64'>, 8000, 2)
+ 1.83±0.1μs 2.08±0.2μs 1.14 index_cached_properties.IndexCache.time_values('DatetimeIndex')
+ 23.5±0.3ms 26.6±0.7ms 1.13 hash_functions.IsinWithArange.time_isin(<class 'numpy.float64'>, 8000, -2)
+ 23.5±0.2ms 26.4±0.5ms 1.12 hash_functions.IsinWithArange.time_isin(<class 'numpy.float64'>, 1000, -2)
+ 23.4±0.2ms 26.2±0.5ms 1.12 hash_functions.IsinWithArange.time_isin(<class 'numpy.float64'>, 2000, 2)
+ 12.6±0.1ms 14.1±0.5ms 1.12 indexing.IntervalIndexing.time_loc_list
+ 4.98±0.02ms 5.57±0.05ms 1.12 timeseries.ResampleSeries.time_resample('period', '1D', 'mean')
+ 23.2±0.2ms 25.9±0.8ms 1.11 hash_functions.IsinWithArange.time_isin(<class 'numpy.float64'>, 2000, -2)
+ 664±50ns 739±70ns 1.11 index_cached_properties.IndexCache.time_inferred_type('RangeIndex')
+ 1.18±0.02ms 1.32±0.01ms 1.11 series_methods.IsInForObjects.time_isin_short_series_long_values
+ 1.78±0.1μs 1.98±0.2μs 1.11 index_cached_properties.IndexCache.time_values('PeriodIndex')
+ 3.66±0.2μs 4.05±0.4μs 1.11 index_cached_properties.IndexCache.time_inferred_type('Float64Index')
+ 4.28±0.04ms 4.72±0.08ms 1.10 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'numpy.float64'>, 80000)
- 3.34±0.05μs 3.04±0.02μs 0.91 tslibs.tslib.TimeIntsToPydatetime.time_ints_to_pydatetime('time', 0, tzlocal())
- 16.9±1μs 15.4±0.2μs 0.91 dtypes.Dtypes.time_pandas_dtype('timedelta64')
- 1.05±0.2ms 949±60μs 0.91 ctors.SeriesConstructors.time_series_constructor(<function list_of_str at 0x7f8dd9d979d0>, False, 'int')
- 5.01±0.1ms 4.54±0.03ms 0.91 index_object.Indexing.time_get_loc_non_unique('Float')
- 3.08±0.3ms 2.79±0.07ms 0.91 rolling.Quantile.time_quantile('DataFrame', 10, 'int', 1, 'linear')
- 705±10ms 638±5ms 0.90 join_merge.I8Merge.time_i8merge('outer')
- 727±4ms 656±4ms 0.90 join_merge.I8Merge.time_i8merge('left')
- 3.19±0.05μs 2.88±0.02μs 0.90 tslibs.tslib.TimeIntsToPydatetime.time_ints_to_pydatetime('time', 0, datetime.timezone.utc)
- 2.79±0.03ms 2.52±0.02ms 0.90 indexing.NumericSeriesIndexing.time_loc_list_like(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 34.8±0.5ms 31.4±0.7ms 0.90 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.uint64'>, 20)
- 393±4ms 354±5ms 0.90 hash_functions.NumericSeriesIndexingShuffled.time_loc_slice(<class 'pandas.core.indexes.numeric.Int64Index'>, 5000000)
- 3.34±0.07μs 3.00±0.06μs 0.90 tslibs.tslib.TimeIntsToPydatetime.time_ints_to_pydatetime('time', 0, None)
- 311±3μs 279±3μs 0.90 hash_functions.IsinWithArangeSorted.time_isin(<class 'numpy.int64'>, 8000)
- 3.51±1ms 3.14±0.1ms 0.90 ctors.SeriesConstructors.time_series_constructor(<class 'list'>, False, 'int')
- 1.44±0.5ms 1.29±0.08ms 0.90 frame_ctor.FromRecords.time_frame_from_records_generator(1000)
- 113±1ms 101±0.8ms 0.90 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'numpy.float64'>, 900000)
- 72.1±2ms 64.4±0.9ms 0.89 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.float64'>, 20)
- 13.3±1μs 11.9±0.6μs 0.89 index_cached_properties.IndexCache.time_is_all_dates('TimedeltaIndex')
- 3.70±0.08μs 3.29±0.05μs 0.89 tslibs.tslib.TimeIntsToPydatetime.time_ints_to_pydatetime('date', 1, datetime.timezone.utc)
- 3.32±0.8ms 2.95±0.1ms 0.89 ctors.SeriesConstructors.time_series_constructor(<function arr_dict at 0x7f8dd9d97af0>, False, 'float')
- 88.9±1ms 79.0±1ms 0.89 multiindex_object.Duplicated.time_duplicated
- 394±2ms 349±4ms 0.89 hash_functions.NumericSeriesIndexingShuffled.time_loc_slice(<class 'pandas.core.indexes.numeric.UInt64Index'>, 5000000)
- 12.5±0.6ms 11.1±0.6ms 0.89 hash_functions.UniqueAndFactorizeArange.time_unique(10)
- 753±2ms 667±6ms 0.88 join_merge.I8Merge.time_i8merge('right')
- 4.20±0.1μs 3.72±0.06μs 0.88 tslibs.tslib.TimeIntsToPydatetime.time_ints_to_pydatetime('datetime', 1, None)
- 3.14±0.3ms 2.77±0.03ms 0.88 rolling.Quantile.time_quantile('DataFrame', 10, 'int', 0, 'higher')
- 42.4±2ms 37.4±0.4ms 0.88 rolling.Quantile.time_quantile('DataFrame', 10, 'int', 0.5, 'midpoint')
- 138±5ms 122±3ms 0.88 gil.ParallelGroupbyMethods.time_loop(8, 'last')
- 196±2ms 173±2ms 0.88 sparse.SparseSeriesToFrame.time_series_to_frame
- 35.1±4ms 30.9±0.8ms 0.88 gil.ParallelGroupbyMethods.time_loop(2, 'sum')
- 32.0±0.8ms 28.1±0.4ms 0.88 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.int64'>, 20)
- 12.4±0.6ms 10.9±0.4ms 0.88 hash_functions.UniqueAndFactorizeArange.time_unique(15)
- 137±10ms 120±1ms 0.88 gil.ParallelGroupbyMethods.time_loop(8, 'prod')
- 12.7±2ms 11.0±0.5ms 0.87 hash_functions.UniqueAndFactorizeArange.time_unique(11)
- 958±10μs 834±30μs 0.87 timeseries.DatetimeIndex.time_unique('repeated')
- 1.01±0.3ms 876±40μs 0.87 ctors.SeriesConstructors.time_series_constructor(<function list_of_str at 0x7f8dd9d979d0>, False, 'float')
- 29.1±2μs 25.3±2μs 0.87 index_cached_properties.IndexCache.time_engine('CategoricalIndex')
- 12.5±0.4ms 10.9±0.4ms 0.87 hash_functions.UniqueAndFactorizeArange.time_unique(4)
- 40.1±3ms 34.8±0.3ms 0.87 rolling.Quantile.time_quantile('DataFrame', 10, 'int', 0.5, 'higher')
- 1.04±0.2ms 903±50μs 0.87 ctors.SeriesConstructors.time_series_constructor(<function list_of_str at 0x7f8dd9d979d0>, True, 'float')
- 141±6ms 122±1ms 0.87 gil.ParallelGroupbyMethods.time_loop(8, 'min')
- 1.44±0.2ms 1.25±0.08ms 0.87 ctors.SeriesConstructors.time_series_constructor(<class 'list'>, False, 'float')
- 94.5±20μs 81.7±10μs 0.86 ctors.SeriesConstructors.time_series_constructor(<function no_change at 0x7f8dd9d97940>, False, 'float')
- 72.5±1ms 62.6±1ms 0.86 hash_functions.IsinWithRandomFloat.time_isin(<class 'numpy.float64'>, 750000)
- 3.64±1ms 3.13±0.1ms 0.86 ctors.SeriesConstructors.time_series_constructor(<function arr_dict at 0x7f8dd9d97af0>, True, 'float')
- 580±10ms 499±6ms 0.86 hash_functions.NumericSeriesIndexing.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 5000000)
- 40.2±2ms 34.5±0.3ms 0.86 rolling.Quantile.time_quantile('DataFrame', 10, 'int', 0.5, 'nearest')
- 83.8±0.6ms 71.9±0.3ms 0.86 series_methods.IsInFloat64.time_isin_many_different
- 3.41±0.03ms 2.92±0.03ms 0.86 indexing.NumericSeriesIndexing.time_getitem_list_like(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 12.6±0.5ms 10.8±0.4ms 0.85 hash_functions.UniqueAndFactorizeArange.time_unique(8)
- 106±1ms 89.9±2ms 0.85 join_merge.Align.time_series_align_int64_index
- 580±10ms 492±4ms 0.85 hash_functions.NumericSeriesIndexingShuffled.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 5000000)
- 41.2±3ms 34.7±0.2ms 0.84 rolling.Quantile.time_quantile('DataFrame', 10, 'int', 0.5, 'lower')
- 1.53±0.5ms 1.28±0.09ms 0.84 ctors.SeriesConstructors.time_series_constructor(<class 'list'>, True, 'float')
- 76.7±4ms 63.5±2ms 0.83 gil.ParallelGroupbyMethods.time_loop(4, 'max')
- 2.69±0.5ms 2.23±0.03ms 0.83 frame_methods.Lookup.time_frame_fancy_lookup
- 123±1ms 102±2ms 0.83 hash_functions.IsinWithRandomFloat.time_isin(<class 'numpy.float64'>, 900000)
- 1.96±0.1ms 1.62±0.03ms 0.83 dtypes.SelectDtypes.time_select_dtype_float_exclude('UInt8')
- 73.8±0.9ms 61.0±1ms 0.83 gil.ParallelGroupbyMethods.time_loop(4, 'sum')
- 134±2ms 111±1ms 0.83 hash_functions.IsinWithArangeSorted.time_isin(<class 'numpy.float64'>, 1000000)
- 74.7±4ms 61.4±1ms 0.82 gil.ParallelGroupbyMethods.time_loop(4, 'last')
- 92.0±0.8ms 75.2±1ms 0.82 join_merge.Align.time_series_align_left_monotonic
- 120±40μs 97.3±8μs 0.81 ctors.SeriesConstructors.time_series_constructor(<function no_change at 0x7f8dd9d97940>, True, 'float')
- 8.13±0.1ms 6.59±0.09ms 0.81 index_object.SetOperations.time_operation('int', 'symmetric_difference')
- 13.4±0.6ms 10.9±0.4ms 0.81 hash_functions.UniqueAndFactorizeArange.time_unique(12)
- 659±200ms 533±9ms 0.81 categoricals.Indexing.time_reindex_missing
- 7.14±1ms 5.71±0.07ms 0.80 algorithms.Factorize.time_factorize(False, True, 'boolean')
- 2.03±0.4ms 1.62±0.03ms 0.79 arithmetic.NumericInferOps.time_divide(<class 'numpy.uint8'>)
- 77.0±20ms 61.2±0.8ms 0.79 gil.ParallelGroupbyMethods.time_loop(4, 'mean')
- 7.72±1μs 6.11±0.5μs 0.79 index_cached_properties.IndexCache.time_is_all_dates('DatetimeIndex')
- 5.39±0.8ms 4.25±0.2ms 0.79 algorithms.Factorize.time_factorize(True, False, 'datetime64[ns]')
- 151±20ms 119±2ms 0.79 frame_methods.Duplicated.time_frame_duplicated
- 3.55±0.3ms 2.79±0.06ms 0.79 rolling.Quantile.time_quantile('DataFrame', 10, 'int', 0, 'midpoint')
- 37.6±4ms 29.5±0.4ms 0.79 frame_ctor.FromLists.time_frame_from_lists
- 9.55±0.1ms 7.44±0.1ms 0.78 index_object.SetOperations.time_operation('datetime', 'symmetric_difference')
- 38.8±3ms 30.0±0.6ms 0.77 arithmetic.IrregularOps.time_add
- 635±200μs 490±7μs 0.77 arithmetic.OffsetArrayArithmetic.time_add_dti_offset(<Day>)
- 7.32±0.05ms 5.59±0.2ms 0.76 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 8000, 0)
- 2.16±0.01ms 1.63±0.08ms 0.75 series_methods.NanOps.time_func('prod', 1000000, 'int8')
- 3.12±0.06ms 2.34±0.2ms 0.75 arithmetic.MixedFrameWithSeriesAxis.time_frame_op_with_series_axis0('sub')
- 7.10±0.1ms 5.13±0.07ms 0.72 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 2000, 0)
- 1.71±0.1ms 1.24±0.01ms 0.72 frame_methods.Iteration.time_items_cached
- 7.06±0.06ms 5.08±0.08ms 0.72 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 1000, 0)
- 23.2±3ms 16.6±0.07ms 0.72 frame_ctor.FromDictwithTimestamp.time_dict_with_timestamp_offsets(<Nano>)
- 51.5±30ms 36.6±1ms 0.71 gil.ParallelGroupbyMethods.time_loop(2, 'var')
- 214±20ms 152±0.8ms 0.71 frame_ctor.FromDicts.time_nested_dict_int64
- 132±80μs 92.4±6μs 0.70 ctors.SeriesConstructors.time_series_constructor(<function no_change at 0x7f8dd9d97940>, True, 'int')
- 1.42±0.3ms 991±9μs 0.70 dtypes.SelectDtypes.time_select_dtype_int_exclude('Int64')
- 718±200μs 500±8μs 0.70 categoricals.Constructor.time_interval
- 5.92±0.06ms 4.03±0.08ms 0.68 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 8000, 0)
- 6.40±0.09ms 4.26±0.05ms 0.67 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 1000, -2)
- 5.63±0.02ms 3.74±0.05ms 0.66 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 1000, 0)
- 1.73±0.02ms 1.14±0.08ms 0.66 series_methods.NanOps.time_func('sum', 1000000, 'int8')
- 6.17±0.4ms 4.02±0.06ms 0.65 arithmetic.IndexArithmetic.time_divide('int')
- 5.79±0.1ms 3.77±0.04ms 0.65 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 2000, 0)
- 95.3±40ms 61.6±2ms 0.65 gil.ParallelGroupbyMethods.time_loop(4, 'prod')
- 50.1±20ms 31.1±0.5ms 0.62 gil.ParallelGroupbyMethods.time_loop(2, 'min')
- 5.13±0.4ms 2.97±0.03ms 0.58 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 1000, -2)
- 4.02±0.5ms 2.30±0.1ms 0.57 arithmetic.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.int64'>, 5.0, <built-in function le>)
- 831±100μs 453±10μs 0.54 arithmetic.NumericInferOps.time_add(<class 'numpy.uint16'>)
- 763±10μs 414±100μs 0.54 indexing_engines.NumericEngineIndexing.time_get_loc((<class 'pandas._libs.index.Int8Engine'>, <class 'numpy.int8'>), 'monotonic_incr')
- 774±10μs 342±10μs 0.44 indexing_engines.NumericEngineIndexing.time_get_loc((<class 'pandas._libs.index.Int16Engine'>, <class 'numpy.int16'>), 'monotonic_incr')
- 26.1±0.5ms 8.32±0.3ms 0.32 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 2000, 2)
- 25.9±0.2ms 7.93±0.3ms 0.31 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 2000, -2)
- 24.0±0.1ms 6.87±0.2ms 0.29 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 2000, 2)
- 24.5±0.2ms 6.70±0.2ms 0.27 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 2000, -2)
- 92.1±1ms 130±0.9μs 0.00 indexing.NumericSeriesIndexing.time_loc_scalar(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
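When reading these tables it helps to remember that the ratio column is after divided by before, so values below 1 are speed-ups. A small sketch for recomputing it from the timing cells follows; the helper names are made up for this example, and the parser only handles the `value±error unit` shape seen in the rows above.

```python
import re

# Both the Greek mu and the micro sign are accepted for microseconds.
UNIT_SECONDS = {"ns": 1e-9, "\u03bcs": 1e-6, "\u00b5s": 1e-6, "ms": 1e-3, "s": 1.0}


def to_seconds(cell):
    """Convert a cell such as '92.1±1ms' to seconds (the error is ignored)."""
    match = re.fullmatch(r"([\d.]+)±[\d.]+(\D+)", cell)
    if match is None:
        raise ValueError(f"unexpected cell format: {cell!r}")
    value, unit = match.groups()
    return float(value) * UNIT_SECONDS[unit]


def ratio(before, after):
    return to_seconds(after) / to_seconds(before)


print(round(ratio("23.0±0.07ms", "27.9±0.9ms"), 2))  # 1.21
print(ratio("92.1±1ms", "130±0.9\u03bcs"))           # about 0.0014
```

The last row of the table is therefore not a zero but a ratio of about 0.0014 that is rounded to 0.00 in the output.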
Other comparisons
Linear probing:
before after ratio
[c852b2ee] [1dc7af75]
! 96.5±0.5ms failed n/a hash_functions.IsinWithArange.time_isin(<class 'object'>, 8000, -2)
! 110±1ms failed n/a hash_functions.IsinWithArange.time_isin(<class 'object'>, 8000, 2)
+ 109±1ms 12.8±0.01s 117.14 hash_functions.IsinWithArange.time_isin(<class 'object'>, 2000, -2)
+ 66.9±0.8ms 7.45±0.01s 111.28 hash_functions.IsinWithArange.time_isin(<class 'object'>, 1000, 2)
+ 143±1ms 14.9±0.04s 104.27 hash_functions.IsinWithArange.time_isin(<class 'object'>, 2000, 2)
+ 73.5±2ms 6.40±0.01s 87.12 hash_functions.IsinWithArange.time_isin(<class 'object'>, 1000, -2)
+ 24.2±0.9ms 611±2ms 25.30 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 1000, 2)
+ 25.9±1ms 614±4ms 23.66 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 1000, 2)
+ 520±6ms 11.8±0.03s 22.73 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 750000)
+ 21.1±0.1ms 394±10ms 18.68 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 80000)
+ 628±4ms 11.1±0.07s 17.60 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 900000)
+ 16.0±0.4ms 281±10ms 17.59 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 70000)
+ 480±10ms 8.15±0.05s 16.97 hash_functions.IsinWithRandomFloat.time_isin(<class 'object'>, 750000)
+ 1.84±0.03ms 27.8±0.2ms 15.07 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 8000)
+ 348±8μs 4.87±0.06ms 13.97 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 1300)
+ 1.64±0.02ms 22.7±0.3ms 13.85 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 7000)
+ 538±7μs 6.96±0.1ms 12.93 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 2000)
+ 683±4ms 8.59±0.06s 12.59 hash_functions.IsinWithRandomFloat.time_isin(<class 'object'>, 900000)
+ 22.4±0.3ms 263±9ms 11.75 hash_functions.IsinWithRandomFloat.time_isin(<class 'object'>, 80000)
+ 1.90±0.02ms 21.9±0.6ms 11.50 hash_functions.IsinWithRandomFloat.time_isin(<class 'object'>, 8000)
+ 17.8±0.5ms 194±4ms 10.88 hash_functions.IsinWithRandomFloat.time_isin(<class 'object'>, 70000)
+ 1.69±0.02ms 17.9±0.2ms 10.58 hash_functions.IsinWithRandomFloat.time_isin(<class 'object'>, 7000)
+ 332±6μs 3.42±0.02ms 10.32 hash_functions.IsinWithRandomFloat.time_isin(<class 'object'>, 1300)
+ 537±5μs 5.41±0.03ms 10.08 hash_functions.IsinWithRandomFloat.time_isin(<class 'object'>, 2000)
+ 33.6±0.4ms 145±1ms 4.31 io.csv.ReadCSVConcatDatetime.time_read_csv
+ 6.47±0.05ms 26.5±0.1ms 4.09 join_merge.Merge.time_merge_dataframe_integer_2key(False)
+ 94.0±0.6ms 324±4ms 3.44 multiindex_object.GetLoc.time_large_get_loc
+ 16.2±0.1ms 53.1±0.5ms 3.28 join_merge.Merge.time_merge_dataframe_integer_2key(True)
+ 111±0.8ms 340±2ms 3.06 multiindex_object.GetLoc.time_large_get_loc_warm
+ 662±8μs 1.17±0.04ms 1.77 groupby.GroupByMethods.time_dtype_as_field('datetime', 'nunique', 'transformation')
+ 7.23±0.06ms 12.6±0.6ms 1.74 io.csv.ReadUint64Integers.time_read_uint64_neg_values
+ 665±6μs 1.15±0.01ms 1.72 groupby.GroupByMethods.time_dtype_as_field('datetime', 'nunique', 'direct')
+ 7.78±0.3ms 12.5±0.1ms 1.61 io.csv.ReadUint64Integers.time_read_uint64_na_values
+ 224±3μs 332±9μs 1.48 timeseries.DatetimeIndex.time_unique('dst')
+ 1.17±0.02ms 1.67±0.04ms 1.43 groupby.GroupByMethods.time_dtype_as_field('datetime', 'value_counts', 'direct')
+ 23.3±2ms 33.2±0.6ms 1.42 eval.Eval.time_and('numexpr', 'all')
+ 1.20±0.03ms 1.68±0.03ms 1.41 groupby.GroupByMethods.time_dtype_as_field('datetime', 'value_counts', 'transformation')
+ 6.76±0.1ms 8.67±0.9ms 1.28 timeseries.ResampleSeries.time_resample('datetime', '5min', 'ohlc')
+ 24.7±0.6ms 31.6±1ms 1.28 categoricals.SetCategories.time_set_categories
+ 142±5ms 178±9ms 1.25 gil.ParallelReadCSV.time_read_csv('float')
+ 13.0±0.2ms 16.1±0.3ms 1.24 categoricals.Isin.time_isin_categorical('int64')
+ 387±3ms 463±2ms 1.20 hash_functions.NumericSeriesIndexingShuffled.time_loc_slice(<class 'pandas.core.indexes.numeric.Int64Index'>, 5000000)
+ 2.10±0.03ms 2.50±0.1ms 1.19 io.csv.ReadCSVDInferDatetimeFormat.time_read_csv(False, 'ymd')
+ 386±2ms 459±2ms 1.19 hash_functions.NumericSeriesIndexingShuffled.time_loc_slice(<class 'pandas.core.indexes.numeric.UInt64Index'>, 5000000)
+ 4.37±0.01ms 5.12±0.2ms 1.17 timeseries.ResampleSeries.time_resample('datetime', '5min', 'mean')
+ 49.8±0.8ms 58.0±0.6ms 1.17 index_object.SetOperations.time_operation('date_string', 'symmetric_difference')
+ 7.59±0.06ms 8.78±0.7ms 1.16 frame_methods.MaskBool.time_frame_mask_floats
+ 2.86±0.03ms 3.29±0.1ms 1.15 io.csv.ReadCSVDInferDatetimeFormat.time_read_csv(True, 'ymd')
+ 16.4±0.2ms 18.6±0.3ms 1.13 categoricals.Isin.time_isin_categorical('object')
+ 12.7±0.2ms 14.4±0.4ms 1.13 algorithms.Hashing.time_series_string
- 4.05±0.1ms 3.68±0.07ms 0.91 hash_functions.IsinWithRandomFloat.time_isin(<class 'numpy.float64'>, 80000)
- 1.38±0.04ms 1.25±0.01ms 0.91 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.uint64'>, 16)
- 1.80±0.06μs 1.63±0.02μs 0.91 index_object.Float64IndexMethod.time_get_loc
- 245±5ms 221±4ms 0.90 indexing.NumericSeriesIndexing.time_getitem_array(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 3.91±0.2ms 3.52±0.1ms 0.90 stat_ops.FrameMultiIndexOps.time_op(1, 'var')
- 5.78±0.05ms 5.21±0.08ms 0.90 reindex.DropDuplicates.time_frame_drop_dups_int(True)
- 14.9±0.4ms 13.4±0.2ms 0.90 hash_functions.IsinAlmostFullWithRandomInt.time_isin_outside(<class 'numpy.int64'>, 19)
- 24.5±0.2ms 22.0±0.2ms 0.90 hash_functions.IsinWithArange.time_isin(<class 'object'>, 2000, 0)
- 4.84±0.1ms 4.35±0.07ms 0.90 stat_ops.SeriesMultiIndexOps.time_op(1, 'sem')
- 2.54±0.07ms 2.28±0.02ms 0.90 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.int64'>, 17)
- 3.14±0.07ms 2.81±0.04ms 0.90 stat_ops.FrameMultiIndexOps.time_op(0, 'mean')
- 3.47±0.07ms 3.11±0.04ms 0.90 stat_ops.SeriesMultiIndexOps.time_op(0, 'var')
- 3.26±0.2ms 2.92±0.05ms 0.90 hash_functions.IsinWithRandomFloat.time_isin(<class 'numpy.float64'>, 70000)
- 249±9ms 222±5ms 0.89 indexing.NumericSeriesIndexing.time_loc_array(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 2.04±0.2μs 1.83±0.2μs 0.89 index_cached_properties.IndexCache.time_values('PeriodIndex')
- 1.28±0ms 1.14±0.02ms 0.89 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.int64'>, 16)
- 295±6μs 263±2μs 0.89 hash_functions.IsinAlmostFullWithRandomInt.time_isin_outside(<class 'numpy.int64'>, 13)
- 26.2±0.6ms 23.3±0.1ms 0.89 hash_functions.IsinAlmostFullWithRandomInt.time_isin_outside(<class 'numpy.float64'>, 19)
- 3.15±0.1ms 2.81±0.04ms 0.89 stat_ops.FrameMultiIndexOps.time_op(0, 'sum')
- 672±10μs 598±8μs 0.89 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.int64'>, 15)
- 5.07±0.2ms 4.51±0.1ms 0.89 stat_ops.SeriesMultiIndexOps.time_op(0, 'sem')
- 3.68±0.5μs 3.27±0.2μs 0.89 index_cached_properties.IndexCache.time_inferred_type('CategoricalIndex')
- 770±8μs 681±10μs 0.88 series_methods.IsInForObjects.time_isin_nans
- 3.93±0.5μs 3.48±0.2μs 0.88 index_cached_properties.IndexCache.time_values('IntervalIndex')
- 1.41±0.02ms 1.25±0.02ms 0.88 series_methods.ValueCounts.time_value_counts('int')
- 3.23±0.08ms 2.85±0.06ms 0.88 reindex.DropDuplicates.time_frame_drop_dups_bool(False)
- 39.7±0.5ms 34.9±0.8ms 0.88 hash_functions.IsinAlmostFullWithRandomInt.time_isin_outside(<class 'numpy.uint64'>, 20)
- 3.19±0.07ms 2.81±0.04ms 0.88 stat_ops.FrameMultiIndexOps.time_op(1, 'sum')
- 316±20μs 278±1μs 0.88 hash_functions.IsinAlmostFullWithRandomInt.time_isin_outside(<class 'numpy.uint64'>, 13)
- 12.4±0.3ms 10.9±0.3ms 0.88 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.float64'>, 18)
- 1.45±0.03ms 1.27±0.04ms 0.88 series_methods.ValueCounts.time_value_counts('uint')
- 2.98±0.1ms 2.62±0.06ms 0.88 stat_ops.SeriesMultiIndexOps.time_op(0, 'prod')
- 17.0±0.3ms 14.9±0.1ms 0.88 hash_functions.IsinAlmostFullWithRandomInt.time_isin_outside(<class 'numpy.uint64'>, 19)
- 65.3±1ms 57.1±2ms 0.87 gil.ParallelGroupbyMethods.time_loop(4, 'last')
- 3.90±0.05ms 3.41±0.02ms 0.87 timeseries.DatetimeIndex.time_add_timedelta('tz_aware')
- 27.0±1ms 23.6±0.6ms 0.87 reindex.DropDuplicates.time_frame_drop_dups_int(False)
- 16.3±0.6ms 14.2±0.2ms 0.87 hash_functions.UniqueAndFactorizeArange.time_factorize(8)
- 7.44±0.1ms 6.49±0.2ms 0.87 algorithms.Duplicated.time_duplicated(False, 'last', 'float')
- 134±2ms 117±1ms 0.87 gil.ParallelGroupbyMethods.time_loop(8, 'max')
- 4.97±0.06ms 4.33±0.05ms 0.87 index_object.Indexing.time_get_loc_non_unique('Float')
- 37.3±0.5ms 32.4±0.2ms 0.87 algorithms.Factorize.time_factorize(False, False, 'string')
- 7.65±0.08ms 6.65±0.08ms 0.87 algorithms.Factorize.time_factorize(True, True, 'datetime64[ns]')
- 132±2ms 114±3ms 0.87 gil.ParallelGroupbyMethods.time_loop(8, 'sum')
- 159±6ms 137±1ms 0.86 gil.ParallelGroupbyMethods.time_loop(8, 'var')
- 1.79±0.03ms 1.54±0.03ms 0.86 timeseries.DatetimeAccessor.time_dt_accessor_normalize(None)
- 7.81±0.1ms 6.74±0.1ms 0.86 io.hdf.HDFStoreDataFrame.time_read_store_table
- 1.05±0.04ms 903±10μs 0.86 groupby.GroupByMethods.time_dtype_as_group('int', 'cummax', 'transformation')
- 78.2±3ms 67.4±1ms 0.86 gil.ParallelGroupbyMethods.time_loop(4, 'var')
- 3.32±0.03ms 2.86±0.07ms 0.86 timeseries.DatetimeIndex.time_unique('tz_aware')
- 33.9±0.9ms 29.2±0.7ms 0.86 gil.ParallelGroupbyMethods.time_loop(2, 'max')
- 39.3±1ms 33.8±0.3ms 0.86 gil.ParallelGroupbyMethods.time_loop(2, 'var')
- 34.3±0.6ms 29.4±0.9ms 0.86 gil.ParallelGroupbyMethods.time_loop(2, 'prod')
- 1.79±0.01ms 1.54±0.02ms 0.86 timeseries.DatetimeAccessor.time_dt_accessor_normalize('UTC')
- 29.6±2ms 25.3±0.4ms 0.85 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.float64'>, 19)
- 134±7ms 114±2ms 0.85 gil.ParallelGroupbyMethods.time_loop(8, 'mean')
- 133±3ms 114±2ms 0.85 gil.ParallelGroupbyMethods.time_loop(8, 'min')
- 6.62±0.2ms 5.65±0.2ms 0.85 stat_ops.FrameMultiIndexOps.time_op([0, 1], 'sum')
- 2.63±0.1ms 2.24±0.04ms 0.85 hash_functions.NumericSeriesIndexingShuffled.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 500000)
- 67.8±0.6ms 57.8±1ms 0.85 gil.ParallelGroupbyMethods.time_loop(4, 'prod')
- 135±4ms 114±2ms 0.85 gil.ParallelGroupbyMethods.time_loop(8, 'prod')
- 147±3ms 125±1ms 0.85 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'object'>, 20)
- 35.1±0.5ms 29.7±0.6ms 0.85 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.uint64'>, 20)
- 3.34±0.09ms 2.82±0.05ms 0.84 timeseries.DatetimeIndex.time_unique('tz_local')
- 133±4ms 112±2ms 0.84 gil.ParallelGroupbyMethods.time_loop(8, 'last')
- 68.0±0.9ms 57.3±1ms 0.84 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'object'>, 19)
- 482±3μs 406±5μs 0.84 indexing.NumericSeriesIndexing.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 5.10±0.1ms 4.29±0.06ms 0.84 stat_ops.SeriesMultiIndexOps.time_op(1, 'median')
- 67.4±1ms 56.8±0.8ms 0.84 gil.ParallelGroupbyMethods.time_loop(4, 'min')
- 2.97±0.07ms 2.50±0.06ms 0.84 stat_ops.SeriesMultiIndexOps.time_op(1, 'prod')
- 12.2±0.7ms 10.3±0.2ms 0.84 hash_functions.UniqueAndFactorizeArange.time_unique(11)
- 7.60±0.6ms 6.38±0.6ms 0.84 algorithms.Duplicated.time_duplicated(False, False, 'datetime64[ns]')
- 1.45±0.03ms 1.21±0.03ms 0.84 timeseries.DatetimeIndex.time_normalize('repeated')
- 31.9±0.8ms 26.7±0.5ms 0.84 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.int64'>, 20)
- 1.83±0.04ms 1.53±0.03ms 0.84 timeseries.DatetimeAccessor.time_dt_accessor_normalize(tzutc())
- 12.1±0.5ms 10.2±0.2ms 0.84 hash_functions.UniqueAndFactorizeArange.time_unique(13)
- 6.82±0.1ms 5.70±0.2ms 0.83 algorithms.Factorize.time_factorize(False, False, 'Int64')
- 12.1±0.4ms 10.1±0.3ms 0.83 hash_functions.UniqueAndFactorizeArange.time_unique(8)
- 12.2±0.5ms 10.1±0.3ms 0.83 hash_functions.UniqueAndFactorizeArange.time_unique(15)
- 34.1±0.7ms 28.2±0.5ms 0.83 gil.ParallelGroupbyMethods.time_loop(2, 'sum')
- 71.4±0.8ms 58.8±1ms 0.82 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.float64'>, 20)
- 12.3±0.3ms 10.1±0.1ms 0.82 hash_functions.UniqueAndFactorizeArange.time_unique(9)
- 17.4±0.8ms 14.3±0.5ms 0.82 hash_functions.UniqueAndFactorizeArange.time_factorize(5)
- 620±9μs 507±5μs 0.82 categoricals.Constructor.time_interval
- 7.82±0.2ms 6.39±0.3ms 0.82 algorithms.Duplicated.time_duplicated(False, False, 'datetime64[ns, tz]')
- 104±1ms 84.8±1ms 0.82 join_merge.Align.time_series_align_int64_index
- 12.4±0.5ms 10.1±0.2ms 0.82 hash_functions.UniqueAndFactorizeArange.time_unique(10)
- 758±4ms 618±4ms 0.82 join_merge.I8Merge.time_i8merge('right')
- 706±3ms 575±7ms 0.81 join_merge.I8Merge.time_i8merge('outer')
- 732±5ms 596±4ms 0.81 join_merge.I8Merge.time_i8merge('left')
- 4.29±0.1ms 3.49±0.08ms 0.81 algorithms.Duplicated.time_duplicated(False, 'first', 'int')
- 12.3±0.5ms 10.0±0.2ms 0.81 hash_functions.UniqueAndFactorizeArange.time_unique(5)
- 12.4±0.6ms 10.0±0.3ms 0.81 hash_functions.UniqueAndFactorizeArange.time_unique(12)
- 709±4ms 576±4ms 0.81 join_merge.I8Merge.time_i8merge('inner')
- 4.08±0.03ms 3.31±0.04ms 0.81 algorithms.Duplicated.time_duplicated(False, 'first', 'uint')
- 3.00±0.02ms 2.42±0.05ms 0.81 indexing.NumericSeriesIndexing.time_loc_list_like(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 12.4±0.5ms 9.99±0.3ms 0.81 hash_functions.UniqueAndFactorizeArange.time_unique(4)
- 260±2μs 209±7μs 0.80 hash_functions.NumericSeriesIndexing.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 500000)
- 492±10μs 395±10μs 0.80 hash_functions.NumericSeriesIndexing.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 1000000)
- 113±1ms 90.9±1ms 0.80 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'numpy.float64'>, 900000)
- 6.25±0.1ms 5.00±0.1ms 0.80 algorithms.Factorize.time_factorize(False, False, 'uint')
- 2.52±0.2ms 2.01±0.02ms 0.80 timeseries.DatetimeIndex.time_add_timedelta('tz_naive')
- 88.0±0.8ms 70.0±0.6ms 0.80 multiindex_object.Duplicated.time_duplicated
- 73.0±0.2μs 58.0±0.4μs 0.79 indexing.NumericSeriesIndexing.time_getitem_scalar(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 83.1±2ms 66.0±0.5ms 0.79 series_methods.IsInFloat64.time_isin_many_different
- 4.37±0.9μs 3.46±0.2μs 0.79 index_cached_properties.IndexCache.time_values('UInt64Index')
- 4.51±0.2ms 3.57±0.2ms 0.79 algorithms.Duplicated.time_duplicated(False, False, 'int')
- 5.02±0.03ms 3.95±0.04ms 0.79 algorithms.Factorize.time_factorize(True, False, 'datetime64[ns]')
- 7.90±1ms 6.18±0.3ms 0.78 stat_ops.FrameMultiIndexOps.time_op([0, 1], 'var')
- 89.9±2ms 70.3±2ms 0.78 join_merge.Align.time_series_align_left_monotonic
- 5.00±0.09ms 3.91±0.08ms 0.78 algorithms.Factorize.time_factorize(True, False, 'datetime64[ns, tz]')
- 1.93±0.2ms 1.51±0.02ms 0.78 series_methods.NanOps.time_func('prod', 1000000, 'int8')
- 3.51±0.09ms 2.73±0.02ms 0.78 indexing.NumericSeriesIndexing.time_getitem_list_like(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 6.69±0.09ms 5.17±0.2ms 0.77 algorithms.Factorize.time_factorize(False, True, 'boolean')
- 2.33±0.6μs 1.80±0.1μs 0.77 index_cached_properties.IndexCache.time_inferred_type('PeriodIndex')
- 9.39±0.3ms 7.24±0.2ms 0.77 index_object.SetOperations.time_operation('datetime', 'symmetric_difference')
- 2.71±0.02ms 2.08±0.03ms 0.77 series_methods.IsInForObjects.time_isin_long_series_short_values
- 1.49±0.03ms 1.14±0.02ms 0.77 algorithms.Factorize.time_factorize(True, True, 'boolean')
- 137±3ms 105±3ms 0.76 frame_methods.Duplicated.time_frame_duplicated
- 74.6±3ms 56.9±0.5ms 0.76 hash_functions.IsinWithRandomFloat.time_isin(<class 'numpy.float64'>, 750000)
- 3.68±0.8μs 2.79±0.2μs 0.76 index_cached_properties.IndexCache.time_shape('PeriodIndex')
- 23.4±0.6ms 17.6±0.1ms 0.75 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 8000, 2)
- 2.55±0.08ms 1.91±0.05ms 0.75 arithmetic.FrameWithFrameWide.time_op_same_blocks(<built-in function add>)
- 21.6±0.3ms 16.2±0.3ms 0.75 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 8000, 2)
- 156±0.3μs 117±1μs 0.75 indexing.NumericSeriesIndexing.time_loc_scalar(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 133±2ms 99.4±1ms 0.75 hash_functions.IsinWithArangeSorted.time_isin(<class 'numpy.float64'>, 1000000)
- 10.3±0.3ms 7.59±0.2ms 0.74 hash_functions.NumericSeriesIndexingShuffled.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 1000000)
- 1.17±0.01ms 861±10μs 0.74 algorithms.Factorize.time_factorize(True, False, 'boolean')
- 124±2ms 90.7±0.2ms 0.73 hash_functions.IsinWithRandomFloat.time_isin(<class 'numpy.float64'>, 900000)
- 589±8ms 432±2ms 0.73 hash_functions.NumericSeriesIndexing.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 5000000)
- 24.0±0.7ms 17.5±0.2ms 0.73 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 8000, -2)
- 587±8ms 428±4ms 0.73 hash_functions.NumericSeriesIndexingShuffled.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 5000000)
- 22.4±0.5ms 16.2±0.1ms 0.72 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 8000, -2)
- 533±6μs 379±1μs 0.71 indexing.NumericSeriesIndexing.time_getitem_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 7.54±0.2ms 5.34±0.1ms 0.71 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 8000, 0)
- 47.8±0.3ms 33.7±0.2ms 0.70 series_methods.IsInFloat64.time_isin_few_different
- 48.0±0.9ms 33.7±0.3ms 0.70 series_methods.IsInFloat64.time_isin_nan_values
- 6.69±0.4ms 4.70±0.1ms 0.70 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 1000, -2)
- 7.25±0.1ms 5.06±0.05ms 0.70 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 1000, 0)
- 7.33±0.3ms 5.08±0.04ms 0.69 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 2000, 0)
- 4.53±0.1ms 3.12±0.04ms 0.69 algorithms.Duplicated.time_duplicated(False, False, 'uint')
- 964±30μs 662±20μs 0.69 timeseries.DatetimeIndex.time_unique('repeated')
- 6.20±0.1ms 4.02±0.02ms 0.65 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 8000, 0)
- 5.73±0.1ms 3.65±0.03ms 0.64 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 1000, 0)
- 5.37±0.05ms 3.31±0.07ms 0.62 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 1000, -2)
- 6.12±0.1ms 3.70±0.1ms 0.60 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 2000, 0)
- 641±100μs 295±3μs 0.46 indexing_engines.NumericEngineIndexing.time_get_loc((<class 'pandas._libs.index.Int8Engine'>, <class 'numpy.int8'>), 'monotonic_incr')
- 26.9±0.6ms 6.48±0.1ms 0.24 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 2000, 2)
- 26.7±1ms 6.20±0.1ms 0.23 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 2000, -2)
- 25.3±0.8ms 5.08±0.1ms 0.20 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 2000, 2)
- 25.1±0.4ms 4.80±0.07ms 0.19 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 2000, -2)
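For reading these tables: the `ratio` column is the `after` timing divided by the `before` timing, so values below 1 are speed-ups and values above 1 are slow-downs. A minimal sketch that recomputes it from the printed cells follows; the helper names `to_seconds` and `ratio` are made up for illustration and are not part of asv. Because the printed timings are rounded, the recomputed value only approximately matches the printed ratio.

```python
# Hypothetical helpers for illustration only; they are not part of asv.
UNIT_IN_SECONDS = {"ns": 1e-9, "μs": 1e-6, "µs": 1e-6, "us": 1e-6, "ms": 1e-3, "s": 1.0}


def to_seconds(cell):
    """Convert a timing cell such as '25.1±0.4ms' to seconds (the ± part is ignored)."""
    value, _, error_and_unit = cell.partition("±")
    unit = error_and_unit.lstrip("0123456789.")  # drop the error magnitude, keep the unit
    return float(value) * UNIT_IN_SECONDS[unit]


def ratio(before, after):
    """Recompute the ratio column: after / before."""
    return to_seconds(after) / to_seconds(before)


# Two rows from the table above (printed ratios: 0.19 and 0.75).
print(round(ratio("25.1±0.4ms", "4.80±0.07ms"), 2))
print(round(ratio("133±2ms", "99.4±1ms"), 2))
```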
Quadratic probing:
before after ratio
[c852b2ee] [59600e58]
+ 533±5ms 15.3±0.03s 28.66 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 750000)
+ 474±8ms 11.8±0s 24.81 hash_functions.IsinWithRandomFloat.time_isin(<class 'object'>, 750000)
+ 21.6±0.5ms 502±10ms 23.24 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 80000)
+ 16.5±0.4ms 339±20ms 20.50 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 70000)
+ 631±6ms 12.1±0.02s 19.25 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 900000)
+ 22.5±0.5ms 386±10ms 17.19 hash_functions.IsinWithRandomFloat.time_isin(<class 'object'>, 80000)
+ 18.2±0.5ms 280±10ms 15.33 hash_functions.IsinWithRandomFloat.time_isin(<class 'object'>, 70000)
+ 111±2ms 1.44±0.01s 13.04 hash_functions.IsinWithArange.time_isin(<class 'object'>, 8000, 2)
+ 1.85±0.05ms 23.9±0.3ms 12.92 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 8000)
+ 95.4±0.8ms 1.22±0.01s 12.82 hash_functions.IsinWithArange.time_isin(<class 'object'>, 8000, -2)
+ 1.95±0.05ms 23.8±0.6ms 12.22 hash_functions.IsinWithRandomFloat.time_isin(<class 'object'>, 8000)
+ 345±20μs 4.04±0.05ms 11.71 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 1300)
+ 1.64±0.02ms 19.1±0.2ms 11.66 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 7000)
+ 1.68±0.02ms 18.9±0.3ms 11.25 hash_functions.IsinWithRandomFloat.time_isin(<class 'object'>, 7000)
+ 527±6μs 5.58±0.07ms 10.58 hash_functions.IsinWithRandomFloat.time_isin(<class 'object'>, 2000)
+ 332±10μs 3.49±0.04ms 10.53 hash_functions.IsinWithRandomFloat.time_isin(<class 'object'>, 1300)
+ 541±9μs 5.43±0.09ms 10.05 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'object'>, 2000)
+ 66.5±0.5ms 466±3ms 7.01 hash_functions.IsinWithArange.time_isin(<class 'object'>, 1000, 2)
+ 72.8±1ms 424±2ms 5.83 hash_functions.IsinWithArange.time_isin(<class 'object'>, 1000, -2)
+ 108±9ms 589±4ms 5.44 hash_functions.IsinWithArange.time_isin(<class 'object'>, 2000, -2)
+ 143±9ms 657±6ms 4.59 hash_functions.IsinWithArange.time_isin(<class 'object'>, 2000, 2)
+ 6.57±0.1ms 16.2±0.3ms 2.46 join_merge.Merge.time_merge_dataframe_integer_2key(False)
+ 23.0±0.2ms 48.9±0.08ms 2.13 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 1000, 2)
+ 16.3±0.2ms 34.0±0.4ms 2.09 join_merge.Merge.time_merge_dataframe_integer_2key(True)
+ 24.6±0.1ms 50.2±0.4ms 2.04 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 1000, 2)
+ 34.0±0.5ms 43.3±0.4ms 1.27 io.csv.ReadCSVConcatDatetime.time_read_csv
+ 96.8±2ms 118±1ms 1.22 multiindex_object.GetLoc.time_large_get_loc
+ 1.71±0.2μs 2.09±0.3μs 1.22 index_cached_properties.IndexCache.time_inferred_type('PeriodIndex')
+ 114±1ms 135±1ms 1.19 multiindex_object.GetLoc.time_large_get_loc_warm
+ 228±3μs 260±8μs 1.14 timeseries.DatetimeIndex.time_unique('dst')
+ 613±30ns 693±30ns 1.13 index_cached_properties.IndexCache.time_inferred_type('Int64Index')
+ 7.97±0.09ms 8.99±0.8ms 1.13 inference.ToNumericDowncast.time_downcast('int32', 'signed')
+ 6.37±0.4μs 7.02±0.6μs 1.10 index_cached_properties.IndexCache.time_engine('DatetimeIndex')
- 4.95±0.08ms 4.50±0.03ms 0.91 indexing.NonNumericSeriesIndexing.time_getitem_list_like('period', 'non_monotonic')
- 28.6±0.8ms 25.9±0.4ms 0.91 hash_functions.IsinWithArange.time_isin(<class 'object'>, 8000, 0)
- 3.25±0.05ms 2.94±0.01ms 0.90 stat_ops.FrameMultiIndexOps.time_op(0, 'prod')
- 65.9±3ms 59.5±0.5ms 0.90 timedelta.ToTimedeltaErrors.time_convert('ignore')
- 1.88±0.2μs 1.70±0.1μs 0.90 index_cached_properties.IndexCache.time_values('PeriodIndex')
- 3.74±0.5μs 3.37±0.2μs 0.90 index_cached_properties.IndexCache.time_values('CategoricalIndex')
- 11.3±1μs 10.2±0.8μs 0.90 index_cached_properties.IndexCache.time_engine('TimedeltaIndex')
- 3.08±0.02ms 2.77±0.06ms 0.90 stat_ops.FrameMultiIndexOps.time_op(0, 'mean')
- 7.27±0.2ms 6.52±0.1ms 0.90 algorithms.Duplicated.time_duplicated(False, 'first', 'float')
- 155±2ms 139±0.6ms 0.90 gil.ParallelGroupbyMethods.time_loop(8, 'var')
- 3.97±0.3μs 3.56±0.2μs 0.90 index_cached_properties.IndexCache.time_values('Float64Index')
- 1.98±0.09ms 1.76±0.06ms 0.89 indexing.NumericSeriesIndexing.time_loc_list_like(<class 'pandas.core.indexes.numeric.Int64Index'>, 'nonunique_monotonic_inc')
- 387±2ms 345±1ms 0.89 hash_functions.NumericSeriesIndexingShuffled.time_loc_slice(<class 'pandas.core.indexes.numeric.Int64Index'>, 5000000)
- 32.6±0.3ms 29.1±0.8ms 0.89 arithmetic.IrregularOps.time_add
- 390±3ms 348±2ms 0.89 hash_functions.NumericSeriesIndexingShuffled.time_loc_slice(<class 'pandas.core.indexes.numeric.UInt64Index'>, 5000000)
- 16.2±0.09ms 14.4±0.2ms 0.89 categoricals.Isin.time_isin_categorical('object')
- 3.85±0.09ms 3.43±0.03ms 0.89 timeseries.DatetimeIndex.time_add_timedelta('tz_aware')
- 67.8±5ms 60.2±0.3ms 0.89 timedelta.ToTimedeltaErrors.time_convert('coerce')
- 68.5±0.6ms 60.7±0.4ms 0.89 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'numpy.float64'>, 750000)
- 66.6±2ms 58.9±0.6ms 0.88 gil.ParallelGroupbyMethods.time_loop(4, 'prod')
- 16.4±0.1ms 14.5±0.3ms 0.88 hash_functions.UniqueAndFactorizeArange.time_factorize(7)
- 8.63±0.4μs 7.62±0.09μs 0.88 tslibs.timestamp.TimestampProperties.time_is_quarter_end(None, 'B')
- 133±0.9ms 117±2ms 0.88 gil.ParallelGroupbyMethods.time_loop(8, 'prod')
- 70.4±3ms 62.0±0.9ms 0.88 gil.ParallelGroupbyMethods.time_loop(4, 'max')
- 16.3±0.4ms 14.4±0.3ms 0.88 hash_functions.UniqueAndFactorizeArange.time_factorize(14)
- 133±1ms 117±3ms 0.88 gil.ParallelGroupbyMethods.time_loop(8, 'last')
- 16.3±0.2ms 14.3±0.4ms 0.88 hash_functions.UniqueAndFactorizeArange.time_factorize(9)
- 67.3±1ms 59.1±0.5ms 0.88 gil.ParallelGroupbyMethods.time_loop(4, 'min')
- 147±2ms 130±1ms 0.88 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'object'>, 20)
- 14.9±0.4ms 13.1±0.2ms 0.88 hash_functions.IsinAlmostFullWithRandomInt.time_isin_outside(<class 'numpy.int64'>, 19)
- 60.4±0.3ms 52.9±0.6ms 0.88 hash_functions.IsinAlmostFullWithRandomInt.time_isin_outside(<class 'numpy.float64'>, 20)
- 17.4±0.4ms 15.3±0.1ms 0.88 groupby.MultiColumn.time_col_select_numpy_sum
- 29.3±1ms 25.6±0.2ms 0.87 groupby.Groups.time_series_groups('int64_small')
- 134±2ms 117±2ms 0.87 gil.ParallelGroupbyMethods.time_loop(8, 'sum')
- 133±2ms 116±2ms 0.87 gil.ParallelGroupbyMethods.time_loop(8, 'mean')
- 9.54±0.2ms 8.32±0.2ms 0.87 algorithms.Factorize.time_factorize(False, True, 'uint')
- 33.8±0.8ms 29.4±0.3ms 0.87 gil.ParallelGroupbyMethods.time_loop(2, 'prod')
- 7.71±0.2ms 6.71±0.1ms 0.87 timeseries.ResampleSeries.time_resample('datetime', '5min', 'ohlc')
- 26.1±0.4ms 22.7±0.3ms 0.87 hash_functions.IsinAlmostFullWithRandomInt.time_isin_outside(<class 'numpy.float64'>, 19)
- 40.2±1ms 34.8±0.4ms 0.87 gil.ParallelGroupbyMethods.time_loop(2, 'var')
- 1.47±0.02ms 1.27±0.02ms 0.86 timeseries.DatetimeIndex.time_normalize('tz_naive')
- 2.85±0.2ms 2.46±0.06ms 0.86 stat_ops.SeriesMultiIndexOps.time_op(0, 'mean')
- 16.5±0.09ms 14.3±0.2ms 0.86 hash_functions.UniqueAndFactorizeArange.time_factorize(5)
- 1.79±0.01ms 1.54±0.01ms 0.86 timeseries.DatetimeAccessor.time_dt_accessor_normalize('UTC')
- 1.45±0.03ms 1.25±0.01ms 0.86 timeseries.DatetimeIndex.time_normalize('repeated')
- 199±4ms 171±4ms 0.86 sparse.SparseSeriesToFrame.time_series_to_frame
- 16.5±0.1ms 14.2±0.2ms 0.86 hash_functions.UniqueAndFactorizeArange.time_factorize(13)
- 34.2±0.3ms 29.4±0.4ms 0.86 gil.ParallelGroupbyMethods.time_loop(2, 'max')
- 16.5±0.4ms 14.2±0.5ms 0.86 hash_functions.UniqueAndFactorizeArange.time_factorize(11)
- 16.4±0.3ms 14.1±0.4ms 0.86 hash_functions.UniqueAndFactorizeArange.time_factorize(15)
- 6.87±0.2ms 5.88±0.2ms 0.86 stat_ops.FrameMultiIndexOps.time_op([0, 1], 'prod')
- 1.45±0.03ms 1.24±0.04ms 0.85 series_methods.ValueCounts.time_value_counts('int')
- 34.0±0.4ms 29.0±0.4ms 0.85 gil.ParallelGroupbyMethods.time_loop(2, 'min')
- 16.7±0.5ms 14.3±0.1ms 0.85 hash_functions.UniqueAndFactorizeArange.time_factorize(12)
- 68.9±1ms 58.7±0.4ms 0.85 gil.ParallelGroupbyMethods.time_loop(4, 'mean')
- 16.8±0.6ms 14.3±0.5ms 0.85 hash_functions.UniqueAndFactorizeArange.time_factorize(8)
- 6.76±0.2ms 5.73±0.3ms 0.85 stat_ops.FrameMultiIndexOps.time_op([0, 1], 'mean')
- 29.0±1ms 24.6±0.6ms 0.85 reindex.DropDuplicates.time_frame_drop_dups_int(False)
- 2.92±0.07ms 2.48±0.1ms 0.85 stat_ops.SeriesMultiIndexOps.time_op(1, 'sum')
- 4.26±0.1ms 3.61±0.1ms 0.85 algorithms.Duplicated.time_duplicated(False, 'first', 'int')
- 740±5ms 626±3ms 0.85 join_merge.I8Merge.time_i8merge('right')
- 246±6μs 208±7μs 0.85 hash_functions.NumericSeriesIndexing.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 500000)
- 2.69±0.07ms 2.27±0.08ms 0.85 hash_functions.NumericSeriesIndexingShuffled.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 500000)
- 33.9±1ms 28.7±0.3ms 0.84 gil.ParallelGroupbyMethods.time_loop(2, 'sum')
- 7.89±0.4ms 6.66±0.2ms 0.84 algorithms.Factorize.time_factorize(True, True, 'datetime64[ns]')
- 2.88±0.09ms 2.42±0.05ms 0.84 stat_ops.SeriesMultiIndexOps.time_op(0, 'sum')
- 5.77±0.5μs 4.86±0.4μs 0.84 index_cached_properties.IndexCache.time_shape('UInt64Index')
- 29.1±0.3ms 24.5±0.3ms 0.84 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.float64'>, 19)
- 3.39±0.08ms 2.84±0.03ms 0.84 timeseries.DatetimeIndex.time_unique('tz_naive')
- 70.8±0.9ms 59.1±1ms 0.84 hash_functions.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.float64'>, 20)
- 9.32±0.4ms 7.75±0.4ms 0.83 algorithms.Factorize.time_factorize(False, False, 'datetime64[ns]')
- 4.08±0.02ms 3.39±0.04ms 0.83 algorithms.Duplicated.time_duplicated(False, 'first', 'uint')
- 722±4ms 598±5ms 0.83 join_merge.I8Merge.time_i8merge('left')
- 4.26±0.2ms 3.53±0.2ms 0.83 algorithms.Duplicated.time_duplicated(False, 'last', 'int')
- 700±5ms 578±6ms 0.83 join_merge.I8Merge.time_i8merge('outer')
- 12.3±0.6ms 10.1±0.5ms 0.82 hash_functions.UniqueAndFactorizeArange.time_unique(5)
- 10.4±0.1ms 8.55±0.1ms 0.82 timeseries.ResampleSeries.time_resample('period', '5min', 'ohlc')
- 4.21±0.4μs 3.46±0.2μs 0.82 index_cached_properties.IndexCache.time_inferred_type('TimedeltaIndex')
- 12.5±0.4ms 10.3±0.5ms 0.82 hash_functions.UniqueAndFactorizeArange.time_unique(9)
- 12.5±0.3ms 10.3±0.6ms 0.82 hash_functions.UniqueAndFactorizeArange.time_unique(6)
- 12.4±0.6ms 10.1±0.6ms 0.82 hash_functions.UniqueAndFactorizeArange.time_unique(7)
- 3.34±0.09ms 2.73±0.02ms 0.82 timeseries.DatetimeIndex.time_unique('tz_local')
- 7.91±0.6ms 6.47±0.4ms 0.82 algorithms.Duplicated.time_duplicated(False, False, 'datetime64[ns]')
- 12.4±0.5ms 10.1±0.5ms 0.82 hash_functions.UniqueAndFactorizeArange.time_unique(8)
- 104±3ms 84.8±2ms 0.82 join_merge.Align.time_series_align_int64_index
- 711±3ms 577±4ms 0.81 join_merge.I8Merge.time_i8merge('inner')
- 2.72±0.05ms 2.20±0.04ms 0.81 series_methods.IsInForObjects.time_isin_long_series_short_values
- 12.3±0.4ms 9.99±0.3ms 0.81 hash_functions.UniqueAndFactorizeArange.time_unique(11)
- 12.4±0.6ms 10.0±0.5ms 0.81 hash_functions.UniqueAndFactorizeArange.time_unique(4)
- 12.4±0.5ms 10.0±0.3ms 0.81 hash_functions.UniqueAndFactorizeArange.time_unique(10)
- 12.3±0.5ms 9.92±0.2ms 0.81 hash_functions.UniqueAndFactorizeArange.time_unique(13)
- 12.4±0.4ms 9.99±0.2ms 0.81 hash_functions.UniqueAndFactorizeArange.time_unique(12)
- 5.10±0.6ms 4.11±0.07ms 0.81 indexing.IntervalIndexing.time_getitem_scalar
- 6.23±0.06ms 5.02±0.08ms 0.81 algorithms.Factorize.time_factorize(False, False, 'uint')
- 9.38±0.3ms 7.51±0.2ms 0.80 hash_functions.NumericSeriesIndexingShuffled.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 1000000)
- 9.83±0.5ms 7.83±0.4ms 0.80 algorithms.Factorize.time_factorize(False, False, 'datetime64[ns, tz]')
- 3.02±0.1ms 2.40±0.03ms 0.80 stat_ops.SeriesMultiIndexOps.time_op(1, 'mean')
- 12.5±0.6ms 9.88±0.2ms 0.79 hash_functions.UniqueAndFactorizeArange.time_unique(15)
- 72.5±2μs 57.4±0.7μs 0.79 indexing.NumericSeriesIndexing.time_getitem_scalar(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 12.5±0.9ms 9.92±0.3ms 0.79 hash_functions.UniqueAndFactorizeArange.time_unique(14)
- 8.22±0.3ms 6.51±0.2ms 0.79 algorithms.Duplicated.time_duplicated(False, False, 'datetime64[ns, tz]')
- 3.43±0.04ms 2.71±0.02ms 0.79 indexing.NumericSeriesIndexing.time_getitem_list_like(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 90.5±0.9ms 71.5±2ms 0.79 join_merge.Align.time_series_align_left_monotonic
- 4.24±0.4μs 3.35±0.1μs 0.79 index_cached_properties.IndexCache.time_values('TimedeltaIndex')
- 1.45±0.03ms 1.14±0.03ms 0.79 algorithms.Factorize.time_factorize(True, True, 'boolean')
- 6.96±0.09ms 5.43±0.1ms 0.78 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 1000, 0)
- 3.91±0.7μs 3.04±0.2μs 0.78 index_cached_properties.IndexCache.time_shape('DatetimeIndex')
- 7.48±0.3ms 5.83±0.2ms 0.78 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 8000, 0)
- 154±3μs 120±1μs 0.78 timedelta.TimedeltaIndexing.time_unique
- 9.42±0.1ms 7.30±0.06ms 0.77 index_object.SetOperations.time_operation('datetime', 'symmetric_difference')
- 4.52±0.2ms 3.47±0.2ms 0.77 algorithms.Duplicated.time_duplicated(False, False, 'int')
- 88.7±0.7ms 68.2±1ms 0.77 multiindex_object.Duplicated.time_duplicated
- 594±5μs 455±20μs 0.77 indexing.NumericSeriesIndexing.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 136±2ms 104±2ms 0.77 frame_methods.Duplicated.time_frame_duplicated
- 115±2ms 87.3±1ms 0.76 hash_functions.IsinWithRandomFloat.time_isin_outside(<class 'numpy.float64'>, 900000)
- 514±10μs 390±10μs 0.76 hash_functions.NumericSeriesIndexing.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 1000000)
- 5.70±0.07ms 4.31±0.1ms 0.75 algorithms.Factorize.time_factorize(False, False, 'boolean')
- 6.82±0.2ms 5.14±0.09ms 0.75 algorithms.Factorize.time_factorize(False, True, 'boolean')
- 5.04±0.1ms 3.80±0.08ms 0.75 algorithms.Factorize.time_factorize(True, False, 'datetime64[ns, tz]')
- 5.16±0.2ms 3.85±0.07ms 0.75 algorithms.Factorize.time_factorize(True, False, 'datetime64[ns]')
- 525±3μs 390±4μs 0.74 indexing.NumericSeriesIndexing.time_getitem_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 85.4±1ms 63.4±0.5ms 0.74 series_methods.IsInFloat64.time_isin_many_different
- 6.37±0.08ms 4.69±0.09ms 0.74 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 1000, -2)
- 7.23±0.7ms 5.31±0.08ms 0.73 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 2000, 0)
- 135±1ms 98.6±1ms 0.73 hash_functions.IsinWithArangeSorted.time_isin(<class 'numpy.float64'>, 1000000)
- 1.19±0.04ms 864±10μs 0.73 algorithms.Factorize.time_factorize(True, False, 'boolean')
- 124±2ms 90.2±0.9ms 0.73 hash_functions.IsinWithRandomFloat.time_isin(<class 'numpy.float64'>, 900000)
- 157±0.8μs 114±2μs 0.73 indexing.NumericSeriesIndexing.time_loc_scalar(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 5.89±0.06ms 4.25±0.1ms 0.72 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 8000, 0)
- 5.59±0.03ms 4.03±0.04ms 0.72 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 1000, 0)
- 2.99±0.05ms 2.15±0.02ms 0.72 indexing.NumericSeriesIndexing.time_loc_list_like(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
- 4.49±0.05ms 3.23±0.04ms 0.72 algorithms.Duplicated.time_duplicated(False, False, 'uint')
- 583±3ms 412±6ms 0.71 hash_functions.NumericSeriesIndexingShuffled.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 5000000)
- 47.8±0.3ms 33.8±0.3ms 0.71 series_methods.IsInFloat64.time_isin_few_different
- 47.8±0.4ms 33.7±0.2ms 0.71 series_methods.IsInFloat64.time_isin_nan_values
- 5.67±0.06ms 3.99±0.05ms 0.70 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 2000, 0)
- 590±2ms 409±5ms 0.69 hash_functions.NumericSeriesIndexing.time_loc_slice(<class 'pandas.core.indexes.numeric.Float64Index'>, 5000000)
- 981±30μs 649±10μs 0.66 timeseries.DatetimeIndex.time_unique('repeated')
- 4.99±0.04ms 3.29±0.07ms 0.66 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 1000, -2)
- 8.27±1μs 5.36±0.4μs 0.65 index_cached_properties.IndexCache.time_shape('TimedeltaIndex')
- 22.5±0.3ms 14.3±0.2ms 0.63 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 8000, 2)
- 12.7±0.3ms 7.98±0.05ms 0.63 index_object.IntervalIndexMethod.time_intersection_one_duplicate(100000)
- 23.4±0.7ms 14.3±0.1ms 0.61 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 8000, -2)
- 21.3±0.3ms 12.8±0.05ms 0.60 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 8000, 2)
- 21.4±0.2ms 12.8±0.06ms 0.60 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 8000, -2)
- 10.7±0.2ms 6.29±0.1ms 0.59 index_object.IntervalIndexMethod.time_intersection(100000)
- 26.1±0.5ms 6.69±0.07ms 0.26 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 2000, 2)
- 26.2±0.5ms 6.60±0.4ms 0.25 hash_functions.IsinWithArange.time_isin(<class 'numpy.uint64'>, 2000, -2)
- 24.3±0.4ms 5.52±0.3ms 0.23 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 2000, 2)
- 24.6±0.3ms 5.08±0.02ms 0.21 hash_functions.IsinWithArange.time_isin(<class 'numpy.int64'>, 2000, -2)
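To make the compared strategies concrete, here is a minimal sketch of the bucket order each one produces in a power-of-two table. It is not the khash code: `probe_sequence` and its parameters are hypothetical, the quadratic variant is assumed to use triangular increments (+1, +2, +3, ...), and the double-hashing stride is assumed to be a second hash value forced odd.

```python
def probe_sequence(home, n_buckets, strategy, step_hash=1):
    """Return the order in which buckets are visited, starting at `home`.

    Illustrative only. `step_hash` stands in for a second hash of the key
    and is used only by the "double" strategy.
    """
    mask = n_buckets - 1
    if n_buckets <= 0 or n_buckets & mask:
        raise ValueError("n_buckets must be a power of two")
    # An odd stride is coprime with a power-of-two size, so it reaches every bucket.
    stride = (step_hash | 1) & mask
    i = home & mask
    visited = []
    for step in range(1, n_buckets + 1):
        visited.append(i)
        if strategy == "linear":
            i = (i + 1) & mask        # next neighbour: cache friendly, clusters easily
        elif strategy == "quadratic":
            i = (i + step) & mask     # growing jumps: +1, +2, +3, ...
        elif strategy == "double":
            i = (i + stride) & mask   # key-dependent jump
        else:
            raise ValueError(f"unknown strategy: {strategy!r}")
    return visited


for name in ("linear", "quadratic", "double"):
    print(name, probe_sequence(3, 8, name, step_hash=5))
# linear    [3, 4, 5, 6, 7, 0, 1, 2]
# quadratic [3, 4, 6, 1, 5, 2, 0, 7]
# double    [3, 0, 5, 2, 7, 4, 1, 6]
```

In this sketch the linear and quadratic orders depend only on the home bucket, so keys that a weak hash sends to the same bucket share their entire probe path. With double hashing the path also depends on the key through the stride.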