BUG: GroupBy.get_group doesnt work with TimeGrouper by sinhrks · Pull Request #6914 · pandas-dev/pandas

I don't think using _get_indices_dict is easy, because BinGrouper doesn't hold information that can be passed to the function as is. What it exposes is bins and binlabels:

>>> import pandas as pd
>>> import datetime
>>> import numpy as np
>>> from pandas.core.groupby import GroupBy, _get_indices_dict

>>> df = pd.DataFrame({'Branch' : 'A A A A A A A B'.split(),
...                    'Buyer': 'Carl Mark Carl Carl Joe Joe Joe Carl'.split(),
...                    'Quantity': [1,3,5,1,8,1,9,3],
...                    'Date' : [
...                     datetime.datetime(2013,1,1,13,0), datetime.datetime(2013,1,1,13,5),
...                     datetime.datetime(2013,10,1,20,0), datetime.datetime(2013,10,2,10,0),
...                     datetime.datetime(2013,10,1,20,0), datetime.datetime(2013,10,2,10,0),
...                     datetime.datetime(2013,12,2,12,0), datetime.datetime(2013,12,2,14,0),]})

>>> grouped = df.groupby(pd.Grouper(freq='1M',key='Date'))
>>> grouped.grouper.bins
[2 2 2 2 2 2 2 2 2 6 6 8]
>>> grouped.grouper.binlabels
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-31, ..., 2013-12-31]
Length: 12, Freq: M, Timezone: None

Thus, I have to convert it using logic similar to the current implementation.

>>> indices = []
>>> i = 0
>>> for j, bin in enumerate(grouped.grouper.bins):
...     # each bin value is treated as a cumulative row count; rows i..bin-1 fall in bin j
...     if i < bin:
...         indices.extend([j] * (bin - i))
...         i = bin
...
>>> indices = np.array(indices)
>>> indices
[ 0  0  9  9  9  9 11 11]

Also, _get_indices_dict returns its keys as tuples, so a further conversion is required.

>>> _get_indices_dict([indices], [grouped.grouper.binlabels])
{(numpy.datetime64('2013-10-31T09:00:00.000000000+0900'),): array([2, 3, 4, 5]), (numpy.datetime64('2013-12-31T09:00:00.000000000+0900'),): array([6, 7]), (numpy.datetime64('2013-01-31T09:00:00.000000000+0900'),): array([0, 1])}
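
The further conversion could be as small as unwrapping the one-element tuples. This is a minimal sketch on a hand-built dictionary shaped like the result above; `unwrap_single_keys` is a hypothetical helper, not a pandas function, and the sample keys are simplified to plain dates for illustration.

```python
import numpy as np


def unwrap_single_keys(indices_dict):
    """Replace one-element tuple keys with the element itself.

    Hypothetical helper (not a pandas function). Keys that are not
    one-element tuples are left unchanged.
    """
    return {
        (key[0] if isinstance(key, tuple) and len(key) == 1 else key): rows
        for key, rows in indices_dict.items()
    }


# Hand-built sample mimicking the shape of the result above.
sample = {
    (np.datetime64('2013-01-31'),): np.array([0, 1]),
    (np.datetime64('2013-10-31'),): np.array([2, 3, 4, 5]),
    (np.datetime64('2013-12-31'),): np.array([6, 7]),
}
flat = unwrap_single_keys(sample)
```

After this step each key is a single label rather than a tuple, so a lookup by one label, which is the kind of lookup get_group needs, can find it directly.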