Merge error on Categorical Interval columns · Issue #28668 · pandas-dev/pandas
Merging fails when the key is a Categorical column whose categories are intervals.
For instance, the following raises "TypeError: data type not understood":
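import numpy as np
import pandas as pd

# Left frame: distances binned into a Categorical column of intervals
bins = np.arange(0, 91, 30)
df1 = pd.DataFrame(np.array([[1, 22], [2, 35], [3, 82]]), columns=['Id', 'Dist']).set_index('Id')
df1['DistGroup'] = pd.cut(df1['Dist'], bins)

# Right frame: one group label per interval, indexed by an IntervalIndex
idx = pd.IntervalIndex.from_breaks(bins)
df2 = pd.DataFrame(np.array(['g1', 'g2', 'g3']), columns=['GroupId'], index=idx)
df2.index.name = 'DistGroup'

# Raises TypeError on pandas 0.25.1
res = pd.merge(df1, df2, left_on='DistGroup', right_index=True).reset_index()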
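To make the key mismatch in the example above visible, the following sketch inspects both merge keys. It only uses the frames from the report; the variable names `left_key`, `right_key`, and `same_intervals` are introduced here for illustration, and the assumption that this dtype difference is what triggers the error is a guess, not something the report confirms.

```python
import numpy as np
import pandas as pd

bins = np.arange(0, 91, 30)
df1 = pd.DataFrame(np.array([[1, 22], [2, 35], [3, 82]]), columns=['Id', 'Dist']).set_index('Id')
df1['DistGroup'] = pd.cut(df1['Dist'], bins)

idx = pd.IntervalIndex.from_breaks(bins)
df2 = pd.DataFrame(np.array(['g1', 'g2', 'g3']), columns=['GroupId'], index=idx)
df2.index.name = 'DistGroup'

# Left key: a Categorical column whose categories form an IntervalIndex
left_key = df1['DistGroup']
print(left_key.dtype)
print(type(left_key.cat.categories).__name__)

# Right key: a plain IntervalIndex (interval dtype, not categorical)
right_key = df2.index
print(right_key.dtype)

# The underlying intervals are the same; only the wrapping dtype differs
same_intervals = left_key.cat.categories.equals(right_key)
print(same_intervals)
```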
Expected Output
|   | Dist | DistGroup | GroupId |
|---|---|---|---|
| 0 | 22 | (0, 30] | g1 |
| 1 | 35 | (30, 60] | g2 |
| 2 | 82 | (60, 90] | g3 |
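Until the merge itself works, one possible workaround is to skip the Categorical key and look up each distance directly in the IntervalIndex with `IntervalIndex.get_indexer`. This is a minimal sketch, not a fix for the bug and not verified against pandas 0.25.1. It assumes every distance falls inside one of the bins, and the resulting `DistGroup` column has interval dtype rather than category dtype. The names `pos` and `res` are illustrative.

```python
import numpy as np
import pandas as pd

bins = np.arange(0, 91, 30)
df1 = pd.DataFrame({'Dist': [22, 35, 82]}, index=pd.Index([1, 2, 3], name='Id'))

idx = pd.IntervalIndex.from_breaks(bins)
df2 = pd.DataFrame({'GroupId': ['g1', 'g2', 'g3']}, index=idx)

# Position of the interval containing each distance.
# A value outside every bin would yield -1, which this sketch does not handle.
pos = idx.get_indexer(df1['Dist'])

# Attach the matching interval and group label positionally
res = df1.assign(
    DistGroup=idx[pos],
    GroupId=df2['GroupId'].to_numpy()[pos],
).reset_index(drop=True)

print(res)
```

The lookup avoids `pd.merge` entirely, so it sidesteps the dtype handling that fails in the report, at the cost of requiring non-overlapping bins.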
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
python : 3.6.9.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None
pandas : 0.25.1
numpy : 1.16.5
pytz : 2019.2
dateutil : 2.8.0
pip : 19.2.2
setuptools : 41.0.1
Cython : 0.29.13
pytest : 5.0.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.3.3
html5lib : 1.0.1
pymysql : None
psycopg2 : 2.7.6.1 (dt dec pq3 ext lo64)
jinja2 : 2.10.1
IPython : 7.8.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.3.3
matplotlib : 3.1.1
numexpr : 2.7.0
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.3.1
sqlalchemy : 1.3.7
tables : 3.5.2
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None