DataFrame.asof() : Timezone Awareness / Naivety comparison TypeError (incorrect) · Issue #21194 · pandas-dev/pandas

import pandas as pd

# Create some random Timestamps
timestamp1 = pd.Timestamp('2018-01-01 21:00:05.001+00:00')
timestamp2 = pd.Timestamp('2018-01-01 22:35:10.550+00:00')

# Build the lookup timestamp for asof(); note that this expression lands after timestamp2 (23:22:43.3245), not between the two index values
timestamp_internal = timestamp2 + ((timestamp2 - timestamp1) / 2)

# Create a DataFrame
df = pd.DataFrame(data=[1,2], index=[timestamp1, timestamp2])

# display it
df
                                  0
2018-01-01 21:00:05.001000+00:00  1
2018-01-01 22:35:10.550000+00:00  2

# Now call asof() and show the issue
df.asof(timestamp_internal)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Anaconda3\envs\py36\lib\site-packages\pandas\core\generic.py", line 6144, in asof
    locs = self.index.asof_locs(where, ~(nulls.values))
  File "C:\Anaconda3\envs\py36\lib\site-packages\pandas\core\indexes\base.py", line 2489, in asof_locs
    result[(locs == 0) & (where < self.values[first])] = -1
  File "C:\Anaconda3\envs\py36\lib\site-packages\pandas\core\indexes\datetimes.py", line 136, in wrapper
    self._assert_tzawareness_compat(other)
  File "C:\Anaconda3\envs\py36\lib\site-packages\pandas\core\indexes\datetimes.py", line 672, in _assert_tzawareness_compat
    raise TypeError('Cannot compare tz-naive and tz-aware '
TypeError: Cannot compare tz-naive and tz-aware datetime-like objects

# Demonstrate other errant behavior - Try using a tz-naive Timestamp to lookup the frame
df.asof(timestamp_internal.tz_localize(None))
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Anaconda3\envs\py36\lib\site-packages\pandas\core\generic.py", line 6108, in asof
    if where < start:
  File "pandas\_libs\tslibs\timestamps.pyx", line 164, in pandas._libs.tslibs.timestamps._Timestamp.__richcmp__
  File "pandas\_libs\tslibs\timestamps.pyx", line 224, in pandas._libs.tslibs.timestamps._Timestamp._assert_tzawareness_compat
TypeError: Cannot compare tz-naive and tz-aware timestamps

Problem description

Somehow the DataFrame index appears to lose the timezone awareness of the original Timestamps: asof() raises a tz-naive vs tz-aware TypeError whether the lookup value is tz-aware or tz-naive.
I only noticed this after upgrading to the newest version of pandas (0.23.0), but I cannot say for sure whether a very recent change to pandas caused it.
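
The first traceback fails on `where < self.values[first]` inside `asof_locs`. A quick check, sketched here on the assumption that `.values` is where the timezone goes missing (the report itself does not confirm this), is to compare the index's own `tz` with the dtype of its raw values. The names `index_is_aware` and `values_are_aware` are introduced only for this illustration.

```python
import pandas as pd

# Same two index values as in the reproduction above
timestamp1 = pd.Timestamp('2018-01-01 21:00:05.001+00:00')
timestamp2 = pd.Timestamp('2018-01-01 22:35:10.550+00:00')
df = pd.DataFrame(data=[1, 2], index=[timestamp1, timestamp2])

# The index object itself carries the timezone
index_tz = df.index.tz
print("index tz:", index_tz)

# .values is a plain NumPy datetime64 array, which has no timezone attribute
raw_values = df.index.values
print("values dtype:", raw_values.dtype)

# Hypothetical flags, named here only for this check
index_is_aware = index_tz is not None
values_are_aware = getattr(raw_values.dtype, "tz", None) is not None
print("index aware:", index_is_aware, "| values aware:", values_are_aware)
```

If the index reports a timezone while its raw values do not, any comparison made against those raw values would treat the index side as tz-naive, which would match the first error message.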

I am fairly confident, though not 100% certain, that this usage worked previously, but I am not in a position to test an older version right now.

EDIT: I have now tested this with pandas 0.22.0, and it produces the expected output there.

Hopefully the bug is reproducible.
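
To make it easier to compare versions, the reproduction can be wrapped in a small script that reports the installed pandas version together with the outcome. This is only a sketch: `check_asof` is a hypothetical helper name, not part of pandas, and it reuses the exact values from the session above.

```python
import pandas as pd


def check_asof():
    """Hypothetical helper: return (status, detail) for a tz-aware asof() lookup."""
    timestamp1 = pd.Timestamp('2018-01-01 21:00:05.001+00:00')
    timestamp2 = pd.Timestamp('2018-01-01 22:35:10.550+00:00')
    # Same lookup value as in the report
    where = timestamp2 + ((timestamp2 - timestamp1) / 2)
    df = pd.DataFrame(data=[1, 2], index=[timestamp1, timestamp2])
    try:
        result = df.asof(where)
    except TypeError as exc:
        # The failure mode described in this report
        return "TypeError", str(exc)
    return "ok", result


status, detail = check_asof()
print("pandas", pd.__version__, "->", status)
print(detail)
```

Running it under each installed version (for example 0.22.0 and 0.23.0) should show whether the TypeError is specific to one release.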

Expected Output

0    1
Name: 2018-01-01 23:22:43.324500+0000, dtype: int64
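
A possible workaround while the tz-aware lookup fails is to do the lookup in tz-naive UTC and re-attach the timezone afterwards. This is an assumption on my part, not something confirmed in this report, and I have not verified it against 0.23.0. It relies on `DataFrame.tz_convert(None)` and `Timestamp.tz_convert(None)`, which convert to UTC and drop the timezone, and it assumes the original index is in UTC, as it is here.

```python
import pandas as pd

timestamp1 = pd.Timestamp('2018-01-01 21:00:05.001+00:00')
timestamp2 = pd.Timestamp('2018-01-01 22:35:10.550+00:00')
where = timestamp2 + ((timestamp2 - timestamp1) / 2)
df = pd.DataFrame(data=[1, 2], index=[timestamp1, timestamp2])

# Convert both sides to tz-naive UTC so asof() compares like with like
df_naive = df.tz_convert(None)
where_naive = where.tz_convert(None)
row = df_naive.asof(where_naive)

# Re-attach UTC to the label of the returned row
row.name = row.name.tz_localize('UTC')
print(row)
```

The cost is an extra conversion on every lookup, so this is only a stopgap until the tz-aware call works again.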

Output of pd.show_versions()

pd.show_versions()

Matplotlib support failed
INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 62 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.23.0
pytest: None
pip: 10.0.1
setuptools: 39.1.0
Cython: 0.28.2
numpy: 1.13.3
scipy: 1.1.0
pyarrow: 0.7.0
xarray: 0.10.4
IPython: 6.4.0
sphinx: 1.7.4
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: None
tables: 3.4.3
numexpr: 2.6.5
feather: 0.4.0
matplotlib: 2.2.2
openpyxl: 2.5.3
xlrd: 1.1.0
xlwt: None
xlsxwriter: 1.0.4
lxml: 4.2.1
bs4: 4.6.0
html5lib: 0.9999999
sqlalchemy: 1.2.7
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None