merge_asof() must be able to operate with timezone-aware DatetimeIndex · Issue #14844 · pandas-dev/pandas
I can perform the following merge just fine:
    import numpy as np
    import pandas as pd

    left = pd.DataFrame({'date': pd.DatetimeIndex(start=pd.to_datetime('2016-01-02'), freq='D', periods=5),
                         'value1': np.arange(5)})
    right = pd.DataFrame({'date': pd.DatetimeIndex(start=pd.to_datetime('2016-01-01'), freq='D', periods=5),
                          'value2': list("ABCDE")})
    pd.merge_asof(left, right, on='date', tolerance=pd.Timedelta('1 day'))
However, adding a timezone to the DatetimeIndex doesn't work:
    import pytz

    left = pd.DataFrame({'date': pd.DatetimeIndex(start=pd.to_datetime('2016-01-02'), freq='D', periods=5, tz=pytz.timezone('UTC')),
                         'value1': np.arange(5)})
    right = pd.DataFrame({'date': pd.DatetimeIndex(start=pd.to_datetime('2016-01-01'), freq='D', periods=5, tz=pytz.timezone('UTC')),
                          'value2': list("ABCDE")})
    pd.merge_asof(left, right, on='date', tolerance=pd.Timedelta('1 day'))
I get the oddly worded error:
MergeError: incompatible tolerance, must be compat with type <class 'pandas.tseries.index.DatetimeIndex'>
The solution is actually very simple. _AsOfMerge._get_merge_keys() needs to check for is_datetime64tz_dtype() in addition to is_datetime64_dtype() when there is a tolerance. I should also fix the error message to be clearer.
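To make the proposed check concrete, here is a minimal standalone sketch, not the actual body of _AsOfMerge._get_merge_keys(). The helper names is_datetime_like_key and check_tolerance are hypothetical, and the error wording is only a suggestion. It uses isinstance(dtype, pd.DatetimeTZDtype) for the timezone-aware case, which should be equivalent to is_datetime64tz_dtype() for this purpose.

```python
import pandas as pd
from pandas.api.types import is_datetime64_dtype


def is_datetime_like_key(key: pd.Series) -> bool:
    """Hypothetical helper: True for tz-naive and tz-aware datetime keys."""
    # is_datetime64_dtype() only matches tz-naive datetime64 columns.
    tz_naive = is_datetime64_dtype(key.dtype)
    # Timezone-aware columns carry a DatetimeTZDtype instead.
    tz_aware = isinstance(key.dtype, pd.DatetimeTZDtype)
    return tz_naive or tz_aware


def check_tolerance(key: pd.Series, tolerance) -> None:
    """Hypothetical helper: validate a tolerance against a datetime key."""
    if not is_datetime_like_key(key):
        return  # other key types are out of scope for this sketch
    if not isinstance(tolerance, pd.Timedelta):
        # Suggested clearer wording: name the expected and received types.
        raise TypeError(
            "tolerance must be a pandas Timedelta for a datetime key of "
            f"dtype {key.dtype}, got {type(tolerance).__name__}"
        )
    if tolerance < pd.Timedelta(0):
        raise ValueError("tolerance must be positive")


naive_key = pd.Series(pd.date_range("2016-01-01", periods=5, freq="D"))
aware_key = pd.Series(pd.date_range("2016-01-01", periods=5, freq="D", tz="UTC"))

# Both key flavours accept the same Timedelta tolerance.
check_tolerance(naive_key, pd.Timedelta("1 day"))
check_tolerance(aware_key, pd.Timedelta("1 day"))
```

With both branches in place, a timezone-aware key takes the same validation path as a tz-naive one instead of falling through to the incompatible-tolerance error.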