COMPAT: unique() should preserve the dtype of the input by stuarteberg · Pull Request #27874 · pandas-dev/pandas

The referenced issue seems to imply that the trouble is related to converting or comparing datetime64[D] values with datetime64[ns] values. In this test, the input is datetime64[D], but it is implicitly converted to datetime64[ns] when it is loaded into a Categorical.

One simple hack to make this test pass is to change the input dtype from datetime64[D] to datetime64[ns]. That doesn't seem appropriate, though, since it would only hide the problem.
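For reference, the day-to-nanosecond unit conversion can be reproduced with NumPy alone. This is only a minimal sketch of what changing units does to the underlying integers. It assumes nothing about how Categorical performs the conversion internally, and it does not reproduce the NaT. The variable names are mine, not from the test.

```python
import numpy as np

# Integers are interpreted as days since the Unix epoch (1970-01-01).
days = np.array([1, 2, 3, 3], dtype="datetime64[D]")

# Explicit unit conversion to nanosecond resolution.
nanos = days.astype("datetime64[ns]")

# The two arrays describe the same instants in time...
same_instants = bool((days == nanos).all())

# ...but the raw int64 payloads differ by the number of nanoseconds per day.
raw_days = days.view("int64")
raw_nanos = nanos.view("int64")
ns_per_day = 24 * 60 * 60 * 10**9

print(days.dtype, raw_days)
print(nanos.dtype, raw_nanos)
print(same_instants, bool((raw_nanos == raw_days * ns_per_day).all()))
```

Since the raw integers differ between the two units, any code path that compares or hashes the underlying int64 values without first aligning the units could fail to match values to categories. Whether that is what happens here is a guess on my part.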

Anyway, to get an idea of what is actually going wrong, here's what happens when I run the test's first three lines in my terminal, with dtype set to datetime64[D]. Note that the first item becomes NaT for some reason, even though its value (1970-01-02) is one of the categories.

In [190]: cat_array = np.array([1, 2, 3, 4, 5], dtype=np.dtype(dtype))

In [191]: input1 = np.array([1, 2, 3, 3], dtype=np.dtype(dtype))

In [192]: tc1 = Series(Categorical(input1, categories=cat_array, ordered=False))

In [193]: tc1
Out[193]:
0          NaT
1   1970-01-03
2   1970-01-04
3   1970-01-04
dtype: category
Categories (5, datetime64[ns]): [1970-01-02, 1970-01-03, 1970-01-04, 1970-01-05, 1970-01-06]