PERF: construct DataFrame with string array and dtype=str by topper-123 · Pull Request #36432 · pandas-dev/pandas

Avoid the inefficient call to arr.astype() when constructing a DataFrame with dtype=str; use ensure_string_array instead.

Performance example:

    x = np.array([str(u) for u in range(1_000_000)], dtype=object).reshape(500_000, 2)
    %timeit pd.DataFrame(x, dtype=str)
    391 ms ± 17.7 ms per loop  # master
    11.9 ms ± 131 µs per loop  # after this PR
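For anyone reproducing the measurement outside IPython, where the %timeit magic is unavailable, the benchmark could be sketched with the standard timeit module roughly as follows. The helper names make_string_array and time_construct are hypothetical and not part of this PR. The array is also smaller than in the example above to keep the run short, so absolute timings will differ from the figures quoted and will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd


def make_string_array(n_rows: int) -> np.ndarray:
    """Build an object-dtype 2-D array of Python strings, shape (n_rows, 2).

    Hypothetical helper for illustration; mirrors the array in the PR example.
    """
    values = [str(u) for u in range(n_rows * 2)]
    return np.array(values, dtype=object).reshape(n_rows, 2)


def time_construct(arr: np.ndarray, repeat: int = 3) -> float:
    """Return the best wall-clock time (seconds) of `repeat` constructions.

    Hypothetical helper; each run builds the DataFrame exactly once.
    """
    timings = timeit.repeat(
        lambda: pd.DataFrame(arr, dtype=str), number=1, repeat=repeat
    )
    return min(timings)


# Assumption: 50_000 rows instead of 500_000, purely to keep the sketch quick.
x = make_string_array(50_000)
df = pd.DataFrame(x, dtype=str)
best = time_construct(x)
print(f"best of 3: {best * 1e3:.1f} ms")
```

Running the same script on master and on this branch should show the difference; taking the minimum of a few repeats is a common way to reduce noise from other processes.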

xref #35519, #36304 & #36317.