PERF: Series.to_numpy with float dtype and na_value=np.nan by lukemanley · Pull Request #52430 · pandas-dev/pandas

Avoids filling NA values with np.nan when the array already has a NumPy float dtype: missing values in such an array are already NaN, so the fill is a no-op. (DataFrame has a similar optimization in BlockManager.as_array.)
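The idea behind the fast path can be sketched in plain NumPy. This is a minimal illustration, not the code changed in this PR: `fill_na_sketch` is a hypothetical helper, and it assumes numeric input where missing values are represented as NaN.

```python
import numpy as np


def fill_na_sketch(values: np.ndarray, na_value: float) -> np.ndarray:
    """Hypothetical helper illustrating the fast path; not pandas internals."""
    is_float = values.dtype.kind == "f"

    # Fast path: in a NumPy float array the missing values are already NaN,
    # so "filling" with NaN changes nothing. Skip the scan and the copy.
    if is_float and np.isnan(na_value):
        return values

    # Slow path: copy to float64, locate NaNs, and overwrite them.
    result = np.array(values, dtype="float64")  # always copies
    result[np.isnan(result)] = na_value
    return result


arr = np.array([1.0, np.nan, 3.0])

same = fill_na_sketch(arr, np.nan)   # fast path, returns the input array itself
filled = fill_na_sketch(arr, 0.0)    # slow path, returns a filled copy
```

The slow path does O(n) work (a copy plus a mask), while the fast path is constant time, which is consistent with the timings below being independent of the array size after the change.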

import pandas as pd
import numpy as np

ser = pd.Series(np.random.randn(10_000_000))

%timeit ser.to_numpy(dtype="float64", na_value=np.nan, copy=False)

# 22.5 ms ± 2.56 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)          -> main
# 1.97 µs ± 24.5 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)   -> PR
       before           after         ratio
     [3ce07cb4]       [020fb14a]
     <main>           <series-to-numpy-float-nan>
-     1.02±0.02ms      2.12±0.03μs     0.00  series_methods.ToNumpy.time_to_numpy_float_with_nan

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.