BUG: join operation fails on overlapping IntervalIndex levels · Issue #45661 · pandas-dev/pandas
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
range_index = pd.RangeIndex(3, name="range_index")
interval_index = pd.IntervalIndex.from_tuples([ (0.0, 1.0), (1.0, 2.0), (1.5, 2.5) ], name='interval_index')
multi_index = pd.MultiIndex.from_product([interval_index, range_index])
print(interval_index.join(multi_index))
# This causes the same issue
print(multi_index.join(interval_index))
Issue Description
Observed output:
Traceback (most recent call last):
File "/home/jmu3si/tmp/join_index_flipped.py", line 11, in <module>
print(interval_index.join(multi_index))
File "/home/jmu3si/miniconda3/envs/myroot/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 216, in join
join_index, lidx, ridx = meth(self, other, how=how, level=level, sort=sort)
File "/home/jmu3si/miniconda3/envs/myroot/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 4368, in join
return self._join_multi(other, how=how)
File "/home/jmu3si/miniconda3/envs/myroot/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 4531, in _join_multi
result = self._join_level(other, level, how=how)
File "/home/jmu3si/miniconda3/envs/myroot/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 4633, in _join_level
new_level, left_lev_indexer, right_lev_indexer = old_level.join(
File "/home/jmu3si/miniconda3/envs/myroot/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 216, in join
join_index, lidx, ridx = meth(self, other, how=how, level=level, sort=sort)
File "/home/jmu3si/miniconda3/envs/myroot/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 4426, in join
return self._join_via_get_indexer(other, how, sort)
File "/home/jmu3si/miniconda3/envs/myroot/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 4456, in _join_via_get_indexer
lindexer = self.get_indexer(join_index)
File "/home/jmu3si/miniconda3/envs/myroot/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3721, in get_indexer
raise InvalidIndexError(self._requires_unique_msg)
pandas.errors.InvalidIndexError: cannot handle overlapping indices; use IntervalIndex.get_indexer_non_unique
The join operation fails because get_indexer() raises on overlapping intervals. This is very similar to #44096. The difference is probably that here we are not trying to join two MultiIndex objects.
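The underlying failure can be isolated from the join itself. The following is a minimal sketch, assuming the same interval_index as in the reproducible example; it only illustrates the get_indexer() restriction shown in the traceback and is not a proposed fix for join().

```python
import pandas as pd
from pandas.errors import InvalidIndexError

interval_index = pd.IntervalIndex.from_tuples(
    [(0.0, 1.0), (1.0, 2.0), (1.5, 2.5)], name="interval_index"
)

# (1.0, 2.0] and (1.5, 2.5] share the range (1.5, 2.0], so the index overlaps.
overlapping = interval_index.is_overlapping

# get_indexer() refuses overlapping intervals; this is the call that
# fails at the bottom of the traceback.
raised = False
try:
    interval_index.get_indexer(interval_index)
except InvalidIndexError:
    raised = True

# The variant named in the error message accepts overlapping intervals.
indexer, missing = interval_index.get_indexer_non_unique(interval_index)

print(overlapping, raised, indexer, missing)
```

This suggests that the level join would need to go through get_indexer_non_unique() (or an equivalent path) when a level is an overlapping IntervalIndex, though where exactly that belongs in the join code is left open here.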
Expected Behavior
Expected output:
MultiIndex([((0.0, 1.0], 0),
            ((0.0, 1.0], 1),
            ((0.0, 1.0], 2),
            ((1.0, 2.0], 0),
            ((1.0, 2.0], 1),
            ((1.0, 2.0], 2),
            ((1.5, 2.5], 0),
            ((1.5, 2.5], 1),
            ((1.5, 2.5], 2)],
           names=['interval_index', 'range_index'])
MultiIndex([((0.0, 1.0], 0),
            ((0.0, 1.0], 1),
            ((0.0, 1.0], 2),
            ((1.0, 2.0], 0),
            ((1.0, 2.0], 1),
            ((1.0, 2.0], 2),
            ((1.5, 2.5], 0),
            ((1.5, 2.5], 1),
            ((1.5, 2.5], 2)],
           names=['interval_index', 'range_index'])
Installed Versions
INSTALLED VERSIONS
------------------
commit : bb1f651
python : 3.9.7.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-96-lowlatency
Version : #109-Ubuntu SMP PREEMPT Wed Jan 12 17:51:01 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : de_DE.UTF-8
LOCALE : de_DE.UTF-8
pandas : 1.4.0
numpy : 1.19.5
pytz : 2021.1
dateutil : 2.8.2
pip : 21.2.4
setuptools : 58.0.4
Cython : 0.29.24
pytest : None
hypothesis : None
sphinx : 4.2.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.5.0
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.7.3
sqlalchemy : None
tables : None
tabulate : None
xarray : 0.20.1
xlrd : None
xlwt : None
zstandard : None