merge_asof with one tz-aware datetime "by" parameter and another parameter raises · Issue #26649 · pandas-dev/pandas

Code Sample, a copy-pastable example if possible

import pandas as pd

left = pd.DataFrame({
    'by_col1': pd.DatetimeIndex(['2018-01-01']).tz_localize('UTC'),
    'by_col2': ['HELLO'],
    'on_col': [2],
    'value': ['a']})
right = pd.DataFrame({
    'by_col1': pd.DatetimeIndex(['2018-01-01']).tz_localize('UTC'),
    'by_col2': ['WORLD'],
    'on_col': [1],
    'value': ['b']})
pd.merge_asof(left, right, by=['by_col1', 'by_col2'], on='on_col')

Problem description

This is very similar to #21184.

The only difference is that the merge_asof "by" is made of two columns instead of one: a tz-aware datetime column and a string column.

When running this, I get:

Traceback (most recent call last):
  File "test.py", line 13, in <module>
    pd.merge_asof(left, right, by=['by_col1', 'by_col2'], on='on_col')
  File "myenv/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 462, in merge_asof
    return op.get_result()
  File "myenv/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 1256, in get_result
    join_index, left_indexer, right_indexer = self._get_join_info()
  File "myenv/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 756, in _get_join_info
    right_indexer) = self._get_join_indexers()
  File "myenv/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 1504, in _get_join_indexers
    left_by_values = flip(left_by_values)
  File "myenv/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 1457, in flip
    return np.array(lzip(*xs), labeled_dtypes)
  File "myenv/lib/python3.6/site-packages/pandas/core/dtypes/dtypes.py", line 150, in __repr__
    return str(self)
  File "myenv/lib/python3.6/site-packages/pandas/core/dtypes/dtypes.py", line 129, in __str__
    return self.__unicode__()
  File "myenv/lib/python3.6/site-packages/pandas/core/dtypes/dtypes.py", line 704, in __unicode__
    return "datetime64[{unit}, {tz}]".format(unit=self.unit, tz=self.tz)
SystemError: PyEval_EvalFrameEx returned a result with an error set
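
Since the traceback ends while formatting the tz-aware dtype, one possible workaround is to make the datetime "by" column tz-naive before the merge and restore the timezone afterwards. The following is only a sketch of that idea: the helper drop_tz and the constant BY_TZ_COL are names made up for illustration, and I have not verified that this avoids the error on 0.24.2 specifically.

```python
import pandas as pd

left = pd.DataFrame({
    "by_col1": pd.DatetimeIndex(["2018-01-01"]).tz_localize("UTC"),
    "by_col2": ["HELLO"],
    "on_col": [2],
    "value": ["a"],
})
right = pd.DataFrame({
    "by_col1": pd.DatetimeIndex(["2018-01-01"]).tz_localize("UTC"),
    "by_col2": ["WORLD"],
    "on_col": [1],
    "value": ["b"],
})

BY_TZ_COL = "by_col1"  # the tz-aware "by" column from the report


def drop_tz(df, col):
    """Hypothetical helper: copy df with `col` as tz-naive UTC timestamps."""
    out = df.copy()
    # Normalize to UTC first so both frames share the same wall-clock basis.
    out[col] = out[col].dt.tz_convert("UTC").dt.tz_localize(None)
    return out


merged = pd.merge_asof(
    drop_tz(left, BY_TZ_COL),
    drop_tz(right, BY_TZ_COL),
    by=["by_col1", "by_col2"],
    on="on_col",
)
# Re-attach the timezone that was stripped for the merge.
merged[BY_TZ_COL] = merged[BY_TZ_COL].dt.tz_localize("UTC")
```

This assumes every frame can be expressed in UTC; if the original columns use another zone, convert back with tz_convert after the merge.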

Expected Output

I expect merge_asof to work and to match rows on both "by" columns.
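
To make the expectation concrete, here is a sketch of what I would expect for the exact frames above, assuming a pandas version in which the multi-column "by" case does not raise. The two rows share by_col1 but differ in by_col2 ('HELLO' vs 'WORLD'), so nothing on the right should match and the right-hand value should come back missing.

```python
import pandas as pd

left = pd.DataFrame({
    "by_col1": pd.DatetimeIndex(["2018-01-01"]).tz_localize("UTC"),
    "by_col2": ["HELLO"],
    "on_col": [2],
    "value": ["a"],
})
right = pd.DataFrame({
    "by_col1": pd.DatetimeIndex(["2018-01-01"]).tz_localize("UTC"),
    "by_col2": ["WORLD"],
    "on_col": [1],
    "value": ["b"],
})

# Both "by" columns must match for a right row to be eligible.
result = pd.merge_asof(left, right, by=["by_col1", "by_col2"], on="on_col")

# Expected: one row, left value kept, right value missing, tz preserved.
has_one_row = len(result) == 1
right_value_missing = bool(result["value_y"].isna().all())
```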

Output of pd.show_versions() (pandas 0.24.2)

INSTALLED VERSIONS

commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-862.3.3.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.2
pytest: 4.5.0
pip: 19.1.1
setuptools: 40.8.0
Cython: 0.28.5
numpy: 1.16.4
scipy: 1.1.0
pyarrow: 0.12.1
xarray: None
IPython: 7.3.0
sphinx: 1.4.6
patsy: 0.5.1
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.9
feather: None
matplotlib: 3.0.2
openpyxl: 2.5.3
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: 4.3.1
bs4: None
html5lib: None
sqlalchemy: 1.2.18
pymysql: None
psycopg2: 2.7.1 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.7.0
gcsfs: None