merge_asof with one tz-aware datetime "by" parameter and another parameter raises · Issue #26649 · pandas-dev/pandas
Code Sample, a copy-pastable example if possible
import pandas as pd

left = pd.DataFrame({
    'by_col1': pd.DatetimeIndex(['2018-01-01']).tz_localize('UTC'),
    'by_col2': ['HELLO'],
    'on_col': [2],
    'value': ['a']})
right = pd.DataFrame({
    'by_col1': pd.DatetimeIndex(['2018-01-01']).tz_localize('UTC'),
    'by_col2': ['WORLD'],
    'on_col': [1],
    'value': ['b']})
pd.merge_asof(left, right, by=['by_col1', 'by_col2'], on='on_col')
Problem description
This is very similar to: #21184
The only difference is that the merge_asof "by" parameter is made of 2 columns (instead of one):
- one is tz-aware
- the other one is something else (string, number etc...)
When running this, I get:
Traceback (most recent call last):
  File "test.py", line 13, in <module>
    pd.merge_asof(left, right, by=['by_col1', 'by_col2'], on='on_col')
  File "myenv/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 462, in merge_asof
    return op.get_result()
  File "myenv/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 1256, in get_result
    join_index, left_indexer, right_indexer = self._get_join_info()
  File "myenv/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 756, in _get_join_info
    right_indexer) = self._get_join_indexers()
  File "myenv/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 1504, in _get_join_indexers
    left_by_values = flip(left_by_values)
  File "myenv/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 1457, in flip
    return np.array(lzip(*xs), labeled_dtypes)
  File "myenv/lib/python3.6/site-packages/pandas/core/dtypes/dtypes.py", line 150, in __repr__
    return str(self)
  File "myenv/lib/python3.6/site-packages/pandas/core/dtypes/dtypes.py", line 129, in __str__
    return self.__unicode__()
  File "myenv/lib/python3.6/site-packages/pandas/core/dtypes/dtypes.py", line 704, in __unicode__
    return "datetime64[{unit}, {tz}]".format(unit=self.unit, tz=self.tz)
SystemError: PyEval_EvalFrameEx returned a result with an error set
Expected Output
I expect merge_asof to work and to match rows on both "by" columns.
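A possible workaround, sketched under the assumption that the failure is specific to the tz-aware dtype (as the traceback suggests): convert the tz-aware "by" column to tz-naive UTC before merging and restore the timezone afterwards. The helper name merge_asof_naive_by and its tz_col parameter are hypothetical, not part of pandas, and this has not been verified against 0.24.2. The sample data uses the same by_col2 value on both sides (unlike the report) so that a match is visible.

```python
import pandas as pd


def merge_asof_naive_by(left, right, tz_col, **kwargs):
    """Hypothetical helper: call pd.merge_asof with a tz-naive copy of tz_col."""
    tz = left[tz_col].dt.tz  # remember the original timezone

    def to_naive_utc(df):
        # Normalize to UTC, then drop the timezone so the dtype is plain datetime64.
        naive = df[tz_col].dt.tz_convert("UTC").dt.tz_localize(None)
        return df.assign(**{tz_col: naive})

    merged = pd.merge_asof(to_naive_utc(left), to_naive_utc(right), **kwargs)
    # Re-attach the original timezone to the "by" column in the result.
    merged[tz_col] = merged[tz_col].dt.tz_localize("UTC").dt.tz_convert(tz)
    return merged


left = pd.DataFrame({
    "by_col1": pd.DatetimeIndex(["2018-01-01"]).tz_localize("UTC"),
    "by_col2": ["HELLO"],
    "on_col": [2],
    "value": ["a"]})
right = pd.DataFrame({
    "by_col1": pd.DatetimeIndex(["2018-01-01"]).tz_localize("UTC"),
    "by_col2": ["HELLO"],  # same key as left (differs from the report) to show a match
    "on_col": [1],
    "value": ["b"]})

result = merge_asof_naive_by(
    left, right, "by_col1", by=["by_col1", "by_col2"], on="on_col")
print(result)
```

The two frames must hold the same instants in tz_col for the keys to line up, which converting both sides to UTC ensures even if they were localized to different zones.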
Output of pd.show_versions() (pandas 0.24.2)
INSTALLED VERSIONS
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-862.3.3.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.24.2
pytest: 4.5.0
pip: 19.1.1
setuptools: 40.8.0
Cython: 0.28.5
numpy: 1.16.4
scipy: 1.1.0
pyarrow: 0.12.1
xarray: None
IPython: 7.3.0
sphinx: 1.4.6
patsy: 0.5.1
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.9
feather: None
matplotlib: 3.0.2
openpyxl: 2.5.3
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: 4.3.1
bs4: None
html5lib: None
sqlalchemy: 1.2.18
pymysql: None
psycopg2: 2.7.1 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.7.0
gcsfs: None