Have pd.array infer new extension types · Issue #29791 · pandas-dev/pandas

Currently pd.array sometimes requires an explicit dtype=... to get one of our extension arrays (it already infers Period, Datetime, and Interval).

This proposal is to have it infer the extension type for

All of these currently return PandasArray.
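For reference, the explicit-dtype workaround that this proposal would make unnecessary looks roughly like the following. This is a minimal sketch: it assumes the nullable dtypes in question are the ones spelled "Int64", "string", and "boolean", and that the installed pandas version provides all three.

```python
import pandas as pd

# Explicit dtype= is what currently selects the nullable extension arrays.
# The dtype strings below are assumptions about which types the proposal covers.
ints = pd.array([1, 2, None], dtype="Int64")
strs = pd.array(["a", None], dtype="string")
bools = pd.array([True, None], dtype="boolean")

print(ints.dtype, strs.dtype, bools.dtype)
```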

Concretely, we'll need to teach infer_dtype not to return 'mixed' for a mix of strings or booleans with NA values, similar to how it already returns 'integer-na' for integers with NA values:

In [27]: lib.infer_dtype([True, None], skipna=False)
Out[27]: 'mixed'

In [28]: lib.infer_dtype(['a', None], skipna=False)
Out[28]: 'mixed'

In [29]: lib.infer_dtype([0, np.nan], skipna=False)
Out[29]: 'integer-na'

and then handle those results in pd.array.
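One way to picture the proposed behavior from user code is sketched below. It is not the pandas implementation: the helpers `infer_extension_dtype` and `array_with_inference` and the `_NULLABLE_DTYPES` mapping are hypothetical names introduced for illustration, and the sketch sidesteps the 'mixed' result by calling the public `pandas.api.types.infer_dtype` with `skipna=True` instead of changing how `skipna=False` behaves.

```python
import pandas as pd
from pandas.api.types import infer_dtype

# Hypothetical mapping from infer_dtype results to nullable extension dtypes.
_NULLABLE_DTYPES = {
    "integer": "Int64",
    "string": "string",
    "boolean": "boolean",
}


def infer_extension_dtype(values):
    """Return a nullable dtype name for `values`, or None if none applies."""
    # skipna=True ignores None/NaN, so [True, None] is reported as 'boolean'
    # rather than 'mixed'.
    kind = infer_dtype(values, skipna=True)
    return _NULLABLE_DTYPES.get(kind)


def array_with_inference(values):
    """Build an array, passing an inferred extension dtype when one is found."""
    # dtype=None falls back to pd.array's own default inference.
    return pd.array(values, dtype=infer_extension_dtype(values))


for data in ([1, None], ["a", None], [True, None]):
    print(data, "->", array_with_inference(data).dtype)
```

Anything the mapping does not cover, such as a genuinely mixed list, yields `None` from the helper and so keeps whatever pd.array does by default.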