Reindexing two tz-aware (UTC) indices gives 'DatetimeArray subtraction must have the same timezones or no timezones' · Issue #32740 · pandas-dev/pandas

Code Sample, a copy-pastable example if possible

import pandas as pd
import datetime as dt

print(pd.__version__)
a = pd.date_range('2010-01-01', '2010-01-02', periods=24, tz='utc')
b = pd.date_range('2010-01-01', '2010-01-02', periods=23, tz='utc')
a.reindex(b, method='nearest', tolerance=dt.timedelta(seconds=20))

Problem description

Output is:

Python 3.7.4 (default, Jul 27 2019, 21:25:02) 
[GCC 7.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import datetime as dt
>>> 
>>> print(pd.__version__)
1.0.2
>>> a = pd.date_range('2010-01-01', '2010-01-02', periods=24, tz='utc')
>>> b = pd.date_range('2010-01-01', '2010-01-02', periods=23, tz='utc')
>>> a.reindex(b, method='nearest', tolerance=dt.timedelta(seconds=20))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/andrew/project/ch2/choochoo-1/py/env/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3144, in reindex
    target, method=method, limit=limit, tolerance=tolerance
  File "/home/andrew/project/ch2/choochoo-1/py/env/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2740, in get_indexer
    indexer = self._get_nearest_indexer(target, limit, tolerance)
  File "/home/andrew/project/ch2/choochoo-1/py/env/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2830, in _get_nearest_indexer
    indexer = self._filter_indexer_tolerance(target, indexer, tolerance)
  File "/home/andrew/project/ch2/choochoo-1/py/env/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2834, in _filter_indexer_tolerance
    distance = abs(self.values[indexer] - target)
  File "/home/andrew/project/ch2/choochoo-1/py/env/lib/python3.7/site-packages/pandas/core/indexes/extension.py", line 147, in method
    result = meth(_maybe_unwrap_index(other))
  File "/home/andrew/project/ch2/choochoo-1/py/env/lib/python3.7/site-packages/pandas/core/arrays/datetimelike.py", line 1458, in __rsub__
    return -(self - other)
  File "/home/andrew/project/ch2/choochoo-1/py/env/lib/python3.7/site-packages/pandas/core/ops/common.py", line 64, in new_method
    return method(self, other)
  File "/home/andrew/project/ch2/choochoo-1/py/env/lib/python3.7/site-packages/pandas/core/arrays/datetimelike.py", line 1406, in __sub__
    result = self._sub_datetime_arraylike(other)
  File "/home/andrew/project/ch2/choochoo-1/py/env/lib/python3.7/site-packages/pandas/core/arrays/datetimes.py", line 667, in _sub_datetime_arraylike
    f"{type(self).__name__} subtraction must have the same "
TypeError: DatetimeArray subtraction must have the same timezones or no timezones

The problem is twofold: I don't see how to reindex these two indices in 1.0.2, and the error message appears to be nonsensical, since both indices have the UTC timezone.
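
A possible workaround, offered only as a sketch: since both indices are UTC, converting them to tz-naive keeps the same instants, so the reindex can be done on tz-naive copies and the timezone re-attached afterwards. This assumes the failure is limited to the tz-aware subtraction shown in the traceback. The names `a_naive`, `b_naive`, `new_index` and `indexer` are illustrative and not part of the original report.

```python
import datetime as dt

import pandas as pd

a = pd.date_range('2010-01-01', '2010-01-02', periods=24, tz='utc')
b = pd.date_range('2010-01-01', '2010-01-02', periods=23, tz='utc')

# Both indices are UTC, so dropping the timezone keeps the same instants.
a_naive = a.tz_convert(None)
b_naive = b.tz_convert(None)

# Index.reindex returns a tuple: (new index, indexer into the original index).
new_index, indexer = a_naive.reindex(
    b_naive, method='nearest', tolerance=dt.timedelta(seconds=20)
)

# Re-attach the timezone that was dropped above (assumption: UTC on both sides).
new_index = new_index.tz_localize('UTC')
```

This should not be used when the two indices carry different timezones without first converting both to a common one.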

Expected Output

With 0.25.3 I see:

(DatetimeIndex([          '2010-01-01 00:00:00+00:00',
               '2010-01-01 01:05:27.272727272+00:00',
               '2010-01-01 02:10:54.545454545+00:00',
               '2010-01-01 03:16:21.818181818+00:00',
               '2010-01-01 04:21:49.090909090+00:00',
               '2010-01-01 05:27:16.363636363+00:00',
               '2010-01-01 06:32:43.636363636+00:00',
               '2010-01-01 07:38:10.909090909+00:00',
               '2010-01-01 08:43:38.181818181+00:00',
               '2010-01-01 09:49:05.454545454+00:00',
               '2010-01-01 10:54:32.727272727+00:00',
                         '2010-01-01 12:00:00+00:00',
               '2010-01-01 13:05:27.272727272+00:00',
               '2010-01-01 14:10:54.545454545+00:00',
               '2010-01-01 15:16:21.818181818+00:00',
               '2010-01-01 16:21:49.090909090+00:00',
               '2010-01-01 17:27:16.363636363+00:00',
               '2010-01-01 18:32:43.636363636+00:00',
               '2010-01-01 19:38:10.909090909+00:00',
               '2010-01-01 20:43:38.181818181+00:00',
               '2010-01-01 21:49:05.454545454+00:00',
               '2010-01-01 22:54:32.727272727+00:00',
                         '2010-01-02 00:00:00+00:00'],
              dtype='datetime64[ns, UTC]', freq=None), array([ 0, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
       -1, -1, -1, -1, -1, 23]))
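
The indexer in the 0.25.3 output can be cross-checked without pandas. The sketch below rebuilds the two evenly spaced grids with the standard library and applies a brute-force "nearest within tolerance" rule. It is a re-derivation for illustration, not pandas' implementation, and `nearest_within` is a hypothetical helper name.

```python
import datetime as dt

start = dt.datetime(2010, 1, 1, tzinfo=dt.timezone.utc)
span = dt.timedelta(days=1)
tolerance = dt.timedelta(seconds=20)

# Evenly spaced points over the same day, mirroring periods=24 and periods=23.
a = [start + span * i / 23 for i in range(24)]
b = [start + span * i / 22 for i in range(23)]


def nearest_within(source, target, tol):
    """For each target value, return the position of the nearest source
    value, or -1 if that nearest value is farther away than tol."""
    result = []
    for t in target:
        pos = min(range(len(source)), key=lambda i: abs(source[i] - t))
        result.append(pos if abs(source[pos] - t) <= tol else -1)
    return result


indexer = nearest_within(a, b, tolerance)
print(indexer)
```

Only the shared endpoints fall within 20 seconds of each other, so the first entry is 0, the last is 23, and every entry in between is -1, which agrees with the array shown for 0.25.3.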

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.7.4.final.0
python-bits : 64
OS : Linux
OS-release : 4.12.14-lp151.28.36-default
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 1.0.2
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 40.8.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.13.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.2.0
numexpr : None
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : None
pytables : None
pytest : None
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.15
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None