COMPAT: unique() should preserve the dtype of the input by stuarteberg · Pull Request #27874 · pandas-dev/pandas
The referenced issue seems to imply that the trouble is related to converting or comparing datetime64[D] to datetime64[ns]. In this test, the input is datetime64[D], but it's implicitly converted to datetime64[ns] when it is loaded into a Categorical.
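To make the unit difference concrete, here is a minimal NumPy-only sketch (not taken from the PR or the test suite). It shows that the same dates compare equal across the two units while their underlying integer values differ. The variable names are illustrative, and nothing here is meant to explain the root cause of the failure.

```python
import numpy as np

# Days since the epoch, stored with day resolution.
days = np.array([1, 2, 3], dtype="datetime64[D]")

# The same dates, stored with nanosecond resolution.
nanos = days.astype("datetime64[ns]")

# NumPy casts to a common unit before comparing, so the dates match.
same_dates = bool((days == nanos).all())

# The raw integers behind the two arrays are very different, though.
raw_days = days.astype("int64")
raw_nanos = nanos.astype("int64")

print(same_dates)
print(raw_days.tolist())
print(raw_nanos.tolist())
```

Any code path that compares or hashes the raw integers without first aligning the units would treat these as different values, which is one plausible place for a mismatch to appear.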
One simple hack to make this test pass is to change datetime64[D] to datetime64[ns]. That doesn't seem appropriate, though.
Anyway, to get an idea of what is actually going wrong, here's what happens when I try the test's first three lines in my terminal. Note that the first item becomes NaT for some reason.
In [190]: cat_array = np.array([1, 2, 3, 4, 5], dtype=np.dtype(dtype))
In [191]: input1 = np.array([1, 2, 3, 3], dtype=np.dtype(dtype))
In [192]: tc1 = Series(Categorical(input1, categories=cat_array, ordered=False))
In [193]: tc1
Out[193]:
0 NaT
1 1970-01-03
2 1970-01-04
3 1970-01-04
dtype: category
Categories (5, datetime64[ns]): [1970-01-02, 1970-01-03, 1970-01-04, 1970-01-05, 1970-01-06]
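For anyone who wants to try this outside the test, the session above can be written as a standalone script. This is a sketch under one assumption: the session never shows the value of dtype, so "datetime64[D]" is used here based on the discussion above. The result depends on the installed pandas version (newer releases may keep a coarser unit than nanoseconds, and may not produce the NaT), so the script reports what it finds rather than expecting a particular outcome.

```python
import numpy as np
import pandas as pd

# Assumption: the test parametrizes dtype; "datetime64[D]" is the case discussed here.
dtype = "datetime64[D]"

cat_array = np.array([1, 2, 3, 4, 5], dtype=np.dtype(dtype))
input1 = np.array([1, 2, 3, 3], dtype=np.dtype(dtype))
tc1 = pd.Series(pd.Categorical(input1, categories=cat_array, ordered=False))

# Report the unit the categories ended up with and how many values went missing.
categories_dtype = tc1.cat.categories.dtype
missing_count = int(tc1.isna().sum())

print(tc1)
print("categories dtype:", categories_dtype)
print("values that became NaT:", missing_count)
```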