BUG: CategoricalIndex allowed reindexing duplicate sources by batterseapower · Pull Request #28257 · pandas-dev/pandas
Had to fix a couple of incidental pandas bugs that were surfaced by the main fix:
`pd.Index([1, 1, 0, 2, 2]).union(pd.Index([0, 2, -1]))` would fail with an error about duplicates in the index. We now return `pd.Index([-1, 0, 1, 1, 2, 2])` (or `pd.Index([1, 1, 0, 2, 2, -1])` if `sort=False`), which is more consistent with the behaviour of `Index.intersection`.
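
To make the intended result concrete, here is a rough pure-Python sketch of the semantics described above. It is not the pandas implementation: `union_keep_duplicates` is a hypothetical helper, and collapsing values that are repeated only in the right-hand input is an assumption made for illustration.

```python
def union_keep_duplicates(left, right, sort=True):
    """Union that keeps duplicates already present in `left`.

    Hypothetical helper for illustration only; not part of pandas.
    """
    seen = set(left)
    missing = []
    # Collect only the values of `right` that `left` lacks, in order of appearance.
    for value in right:
        if value not in seen:
            seen.add(value)
            missing.append(value)
    combined = list(left) + missing
    # sort=False keeps the left-hand order and appends the new values at the end.
    return sorted(combined) if sort else combined


sorted_result = union_keep_duplicates([1, 1, 0, 2, 2], [0, 2, -1])
unsorted_result = union_keep_duplicates([1, 1, 0, 2, 2], [0, 2, -1], sort=False)
print(sorted_result)    # [-1, 0, 1, 1, 2, 2]
print(unsorted_result)  # [1, 1, 0, 2, 2, -1]
```

The two printed lists match the values quoted for the sorted and `sort=False` cases.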
`pd.Index(['A', 'B']).get_indexer_non_unique(pd.Index([0]))` would fail, complaining with a `TypeError` about `'<' not supported between instances of 'str' and 'int'`. This is caused by over-aggressive use of `searchsorted`, solved by falling back to linear search.
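
The fallback idea can be sketched in plain Python, using the standard-library `bisect` module as a stand-in for `searchsorted`. This is only an illustration under that assumption: `positions_of` is a hypothetical name, and the real change lives in pandas' indexing internals rather than in code like this.

```python
from bisect import bisect_left, bisect_right


def positions_of(sorted_values, target):
    """Return every position in `sorted_values` that equals `target`.

    Hypothetical helper for illustration only; not part of pandas.
    """
    try:
        # Fast path: binary search, which needs `target` to be comparable
        # with the stored values using `<`.
        lo = bisect_left(sorted_values, target)
        hi = bisect_right(sorted_values, target)
        return list(range(lo, hi))
    except TypeError:
        # Incomparable types (e.g. str vs int): fall back to a linear scan,
        # which only relies on equality.
        return [i for i, value in enumerate(sorted_values) if value == target]


print(positions_of(["A", "B"], 0))         # []
print(positions_of(["A", "B", "B"], "B"))  # [1, 2]
```

The linear scan is slower, but it only runs when the ordered comparison is impossible, so lookups between comparable types keep the binary-search path.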