TST: Fix test for datetime categorical by sinhrks · Pull Request #10501 · pandas-dev/pandas
Related to #10465, but this addresses a different part.
The current test_groupby_datetime_categorical in test_groupby.py is incorrect: the actual result returns a CategoricalIndex as level 0, whereas the expected result uses a DatetimeIndex as level 0. Changed the test to use the same dtype on both sides and added an explicit comparison.
Actual Result (current test case)
import numpy as np
import pandas as pd

levels = pd.date_range('2014-01-01', periods=4)
codes = np.random.randint(0, 4, size=100)
cats = pd.Categorical.from_codes(codes, levels, name='myfactor', ordered=True)
data = pd.DataFrame(np.random.randn(100, 4))
grouped = data.groupby(cats)
desc_result = grouped.describe()
desc_result.index.get_level_values(0)
# CategoricalIndex([2014-01-01T09:00:00.000000000+0900,
# ...
# 2014-01-04T09:00:00.000000000+0900],
# categories=[2014-01-01 00:00:00, 2014-01-02 00:00:00, 2014-01-03 00:00:00, 2014-01-04 00:00:00],
# ordered=True, name=u'myfactor', dtype='category')
Expected Result (current test case)
Level 0 must be a CategoricalIndex, but the current expected value produces a DatetimeIndex.
idx = cats.codes.argsort()
ord_labels = np.asarray(cats).take(idx)
ord_data = data.take(idx)
expected = ord_data.groupby(ord_labels, sort=False).describe()
expected.index.names = ['myfactor', None]
expected.index.get_level_values(0)
# DatetimeIndex(['2014-01-01', '2014-01-01', '2014-01-01', '2014-01-01',
# ...
# '2014-01-04', '2014-01-04', '2014-01-04', '2014-01-04'],
# dtype='datetime64[ns]', name=u'myfactor', freq=None, tz=None)
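The mismatch and the kind of fix described above can be sketched roughly as follows. This is an illustrative sketch, not the diff from this PR. It assumes a recent pandas, where Categorical.from_codes takes categories= and no longer accepts name=, and it uses mean() rather than describe() so the index layout does not depend on the pandas version. The variable raw_expected_index is a hypothetical name introduced only for this example.

```python
import numpy as np
import pandas as pd

levels = pd.date_range("2014-01-01", periods=4)
rng = np.random.default_rng(0)
# Make sure every category occurs at least once so both sides have 4 groups.
codes = np.concatenate([np.arange(4), rng.integers(0, 4, size=96)])
cats = pd.Categorical.from_codes(codes, categories=levels, ordered=True)
data = pd.DataFrame(rng.standard_normal((100, 4)))

# Grouping by the Categorical itself yields a CategoricalIndex.
result = data.groupby(cats, observed=False).mean()

# Grouping by the plain datetime values yields a DatetimeIndex instead.
idx = cats.codes.argsort(kind="stable")
ord_labels = np.asarray(cats).take(idx)
ord_data = data.take(idx)
expected = ord_data.groupby(ord_labels, sort=False).mean()
raw_expected_index = expected.index  # kept only to show the dtype mismatch

# Convert the expected index to the same categorical dtype, then compare explicitly.
expected.index = pd.CategoricalIndex(
    expected.index, categories=levels, ordered=True
)
pd.testing.assert_frame_equal(result, expected)
```

Without the conversion step, the two frames hold the same values but carry different index types, which is the discrepancy the test previously did not catch.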