REGR: casting datetime strings with offset to tz-naive datetime64 fails · Issue #50140 · pandas-dev/pandas
On pandas 1.5 (so no deprecation warning):
    pd.Index(['2021-01-01 00:00:00+01:00', '2021-01-02 00:00:00+01:00']).astype("datetime64[ns]")
    DatetimeIndex(['2020-12-31 23:00:00', '2021-01-01 23:00:00'], dtype='datetime64[ns]', freq=None)
On the main branch:
    pd.Index(['2021-01-01 00:00:00+01:00', '2021-01-02 00:00:00+01:00']).astype("datetime64[ns]")
    ...
    TypeError: Cannot use .astype to convert from timezone-aware dtype to timezone-naive dtype. Use obj.tz_localize(None) or obj.tz_convert('UTC').tz_localize(None) instead.
This started to fail a few days ago on pyarrow's CI. It comes up when you roundtrip a pandas DataFrame whose columns are a tz-aware DatetimeIndex: in Arrow the column names are stored as strings, and in the arrow->pandas conversion we try to cast the strings to datetime64 and then localize. We should probably cast directly to the tz-aware dtype instead.
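For reference, a minimal sketch of how the strings could be converted without going through `.astype("datetime64[ns]")`. It assumes `pd.to_datetime(..., utc=True)` as the parsing step, which is not necessarily what the pyarrow conversion code does or will do. The fixed `+01:00` target zone is also an assumption for illustration; in the real roundtrip the zone would come from the stored pandas metadata.

```python
from datetime import timedelta, timezone

import pandas as pd

strings = ["2021-01-01 00:00:00+01:00", "2021-01-02 00:00:00+01:00"]

# Parse the offset-carrying strings straight into a tz-aware (UTC) index.
aware_utc = pd.to_datetime(strings, utc=True)

# Option 1: reproduce the pandas 1.5 result (UTC wall times, tz dropped).
naive = aware_utc.tz_localize(None)

# Option 2: stay tz-aware and convert to the target zone.
# Assumption: a fixed +01:00 offset stands in for the original timezone.
target_tz = timezone(timedelta(hours=1))
aware = aware_utc.tz_convert(target_tz)

print(naive)
print(aware)
```

Option 2 avoids the naive intermediate entirely, which is the "cast directly to the tz-aware dtype" idea.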
From a quick look at the recent commits and the code path this takes, this might be caused by #50015 (cc @jbrockmendel)