Confusing "MergeError: incompatible merge keys [1] category and category, must be the same type" · Issue #26136 · pandas-dev/pandas (original) (raw)

I'm using pd.merges_asof() to merge two dataframes together. The by keys are categoricals, but not equal, apparently (one has a few values more than the other). It took me a little bit to figure that out, however, because the error message for this was a bit confusing: "incompatible merge keys [1] category and category, must be the same type". It'd be nice if it was clearer. I can do the pull request, once we figure out how to make this better.

I see the following possible solutions:

  1. Special-case the error message for dtypes that take parameters and so are not necessarily all equal. Something a bit like: "incompatible merge keys: both are category, but not equal ones" (Easiest solution.)
  2. Make a nicer error message for categories, by digging a little bit into what makes them not equal. Something like "incompatible merge keys: both categories, but the left one has 3 levels more", or "but they have different levels: ..."
  3. Change the __str__ method for CategoricalDType to print something a bit more informative than "category". (No guarantee though that what we would print would always allow people to distinguish two not-equal categoricals as not equal.. unless we were to print out all the levels.)

Anything that sounds good here?