PERF: constructing string Series by topper-123 · Pull Request #36317 · pandas-dev/pandas

Avoid a needless call to lib.infer_dtype when a string dtype is passed.
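For context, a minimal sketch of what the skipped call does. It uses the public wrapper `pandas.api.types.infer_dtype` rather than the internal `lib.infer_dtype`, and the array size is an arbitrary choice for illustration. It does not reproduce the constructor's internal code path; it only shows that inference has to inspect the values of an object array, which is wasted work when the caller has already asked for a string dtype.

```python
import numpy as np
from pandas.api.types import infer_dtype

# Object array holding only Python str values (size chosen for illustration).
x = np.array([str(u) for u in range(1_000)], dtype=object)

# Inference inspects the values to decide what the array contains.
inferred = infer_dtype(x, skipna=False)
print(inferred)
```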

Performance example:

    x = np.array([str(u) for u in range(1_000_000)], dtype=object)
    %timeit pd.Series(x, dtype=str)
    344 ms ± 59.7 ms per loop    # v1.1.0
    157 ms ± 7.04 ms per loop    # after #35519
    22.6 ms ± 191 µs per loop    # after #36304
    11.2 ms ± 48.6 µs per loop   # after this PR
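The `%timeit` magic only works in IPython. A rough equivalent with the standard-library `timeit` module might look like the sketch below; the helper name `build_series`, the smaller array size, and the repeat counts are assumptions chosen to keep the script quick, so the numbers it prints are not comparable to the ones above.

```python
import timeit

import numpy as np
import pandas as pd


def build_series(values):
    # Hypothetical helper: the construction being benchmarked.
    return pd.Series(values, dtype=str)


n = 100_000  # smaller than the 1_000_000 used above, to keep the run short
x = np.array([str(u) for u in range(n)], dtype=object)

# Take the best of 3 repeats of 5 calls each, then report the time per call.
number = 5
best = min(timeit.repeat(lambda: build_series(x), number=number, repeat=3)) / number
print(f"{best * 1e3:.2f} ms per loop")

s = build_series(x)
```

Timings depend on the installed pandas version and the machine, so running this on a build before and after the change is the only meaningful comparison.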

A similar speed-up is possible for pd.Series(x, dtype="string"), but it requires some refactoring of StringArray, so I'll do that in a separate PR.

xref #35519 & #36304.