PERF: Series.to_numpy with float dtype and na_value=np.nan by lukemanley · Pull Request #52430 · pandas-dev/pandas
Avoids filling NA values with np.nan if the array is already a NumPy float dtype. (DataFrame has a similar optimization in BlockManager.as_array.)
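The idea behind the fast path can be illustrated with a minimal sketch. This is not the code changed in this PR: `fill_na_as_float` is a hypothetical helper written only for illustration, it ignores the `dtype` and `copy` arguments of `Series.to_numpy`, and it uses only NumPy. It assumes that a missing value is either `None` or NaN.

```python
import numpy as np


def fill_na_as_float(values: np.ndarray, na_value: float = np.nan) -> np.ndarray:
    """Hypothetical helper (not a pandas function) illustrating the short-circuit."""
    na_is_nan = isinstance(na_value, float) and np.isnan(na_value)
    if values.dtype.kind == "f" and na_is_nan:
        # Fast path: missing values in a NumPy float array are already NaN,
        # so there is nothing to fill and no copy is needed.
        return values

    # General path: locate missing entries (None or NaN), then copy and fill.
    flat = values.ravel()
    mask = np.array([v is None or v != v for v in flat], dtype=bool).reshape(values.shape)
    out = np.empty(values.shape, dtype="float64")
    out[~mask] = values[~mask].astype("float64")
    out[mask] = na_value
    return out


floats = np.array([1.0, np.nan, 3.0])
objects = np.array([1.0, None, 3.0], dtype=object)

same = fill_na_as_float(floats)     # fast path: returns the input array itself
filled = fill_na_as_float(objects)  # general path: builds a new float64 array
```

In this sketch the fast path does no work proportional to the array length, which is the kind of saving the timings in this PR reflect; the general path still has to build a mask and write a new array.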
import pandas as pd
import numpy as np
ser = pd.Series(np.random.randn(10_000_000))
%timeit ser.to_numpy(dtype="float64", na_value=np.nan, copy=False)
# 22.5 ms ± 2.56 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) -> main
# 1.97 µs ± 24.5 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each) -> PR
       before           after         ratio
     [3ce07cb4]       [020fb14a]
     <main>           <series-to-numpy-float-nan>
-     1.02±0.02ms      2.12±0.03μs     0.00  series_methods.ToNumpy.time_to_numpy_float_with_nan
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.