BUG: Series.dt.tz accessor returns str instead of tzinfo when using PyArrow timestamps · Issue #55003 · pandas-dev/pandas
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
series = pd.Series(pd.to_datetime(["2023-04-05T00:00:00+00"]), dtype="timestamp[ns, tz=UTC][pyarrow]")
series.dt.tz
OUTPUT: 'UTC'
Issue Description
According to the pandas documentation, the dt.tz accessor of a Series should return the timezone as either datetime.tzinfo, pytz.tzinfo.BaseTZInfo, dateutil.tz.tz.tzfile, or None.
If you use this accessor on pyarrow-based timestamps, you get a str instead. A str cannot be passed to common datetime.datetime methods such as datetime.now(tz), which require a datetime.tzinfo.
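To illustrate why the str return value is a problem in practice, here is a minimal standard-library sketch. It assumes the accessor returned the string 'UTC', as in the example above. The helper coerce_tz is hypothetical (it is not part of pandas or pyarrow) and only shows one possible user-side workaround; it does not handle fixed-offset strings such as "+01:00".

```python
from datetime import datetime, timezone, tzinfo
from zoneinfo import ZoneInfo


def coerce_tz(tz):
    """Hypothetical helper: turn a timezone name (str) into a tzinfo object."""
    if tz is None or isinstance(tz, tzinfo):
        return tz  # already usable by datetime APIs
    if tz.upper() == "UTC":
        return timezone.utc  # no tz database lookup needed for UTC
    return ZoneInfo(tz)  # IANA names such as "Europe/Berlin"


# Stand-in for the str that series.dt.tz returns on pyarrow-backed timestamps.
tz_from_accessor = "UTC"

# Passing the str directly is rejected by the datetime API.
try:
    datetime.now(tz_from_accessor)
except TypeError as exc:
    print(f"passing a str fails: {exc}")

# After coercion, the same call works.
now = datetime.now(coerce_tz(tz_from_accessor))
print(now.tzinfo)
```

With the numpy-based dtype no such coercion is needed, because the accessor already returns a tzinfo object.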
Expected Behavior
import pandas as pd
series = pd.Series(pd.to_datetime(["2023-04-05T00:00:00+00"]), dtype="timestamp[ns, tz=UTC][pyarrow]")
series.dt.tz
EXPECTED OUTPUT: datetime.timezone.utc
This would match the behavior of the following snippet on standard numpy-based datetimes:
import pandas as pd
series = pd.Series(pd.to_datetime(["2023-04-05T00:00:00+00"]))
series.dt.tz
Installed Versions
INSTALLED VERSIONS
commit : ba1cccd
python : 3.11.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.14.0-1054-oem
Version : #61-Ubuntu SMP Fri Oct 14 13:05:50 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.1.0
numpy : 1.24.1
pytz : 2022.7
dateutil : 2.8.2
setuptools : 68.1.2
pip : 23.1.2
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.1
pymysql : None
psycopg2 : 2.9.5
jinja2 : 3.1.2
IPython : 8.11.0
pandas_datareader : None
bs4 : 4.9.3
bottleneck : 1.3.7
dataframe-api-compat: None
fastparquet : None
fsspec : 2022.11.0
gcsfs : None
matplotlib : 3.7.1
numba : 0.57.1
numexpr : 2.8.5
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : 13.0.0
pyreadstat : None
pyxlsb : None
s3fs : 2022.11.0
scipy : 1.11.2
sqlalchemy : 1.4.49
tables : None
tabulate : None
xarray : None
xlrd : 1.2.0
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None