BUG: incorrect rounding in groupby.cummin near int64 implementation bounds by jbrockmendel · Pull Request #40767 · pandas-dev/pandas

I think these iNaT checks are necessary because algos like group_min_max assume that an integer array can never contain iNaT as a real value (they effectively assume datetimelike=True, and that it is safe to use NPY_NAT to mark missing values).

e.g. on this branch:

import pandas as pd
from pandas._libs.tslibs import iNaT

ser = pd.Series([1, iNaT])
print(ser.groupby([1, 1]).max(min_count=2))

gives

1   -9223372036854775808
dtype: int64

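because the iNaT is interpreted as NaN, so only one valid value is counted and min_count=2 isn't reached. The result is then marked as missing by writing iNaT into it, but since the result dtype is int64, that iNaT is no longer interpreted as NaN and shows up as a plain integer.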
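To make the mechanism concrete, here is a minimal NumPy-only sketch of a sentinel-based grouped max. It is not the pandas implementation: `group_max_sentinel` and `INAT` are hypothetical names, and the loop only imitates the behavior described above, assuming the algo treats every iNaT as missing and also writes iNaT to mark a missing result.

```python
import numpy as np

# Same bit pattern as iNaT / NPY_NAT: the smallest int64.
INAT = np.iinfo(np.int64).min


def group_max_sentinel(values, labels, ngroups, min_count=-1):
    """Hypothetical sentinel-based grouped max (illustration only)."""
    out = np.full(ngroups, INAT, dtype=np.int64)
    nobs = np.zeros(ngroups, dtype=np.int64)
    for val, lab in zip(values, labels):
        if val == INAT:
            # Treated as missing, even if the caller meant a real integer.
            continue
        nobs[lab] += 1
        if val > out[lab]:
            out[lab] = val
    # Groups with too few observations are marked missing by writing INAT,
    # which an int64 result cannot distinguish from a real value.
    out[nobs < max(min_count, 1)] = INAT
    return out


values = np.array([1, INAT], dtype=np.int64)
labels = np.array([0, 0])
result = group_max_sentinel(values, labels, ngroups=1, min_count=2)
print(result)  # [-9223372036854775808]
```

With `min_count=2` only one value is counted, so the group is marked missing, and the marker then surfaces as an ordinary int64.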

Forcing mask usage (#40651 (comment)) would be another, more robust way to handle this: missing values could then be specified by modifying the mask itself in place, which would remove the need for the existing iNaT workarounds that signify missing values in the algo result.
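A rough sketch of what the mask-based alternative could look like, again in plain NumPy rather than the actual pandas code. `group_max_masked` and its signature are assumptions made for illustration; the point is only that missingness lives in a separate boolean array, so iNaT stays a legal integer value.

```python
import numpy as np

# Smallest int64; here it is just an ordinary value and a safe start for max.
INAT = np.iinfo(np.int64).min


def group_max_masked(values, mask, labels, ngroups, min_count=-1):
    """Hypothetical mask-based grouped max (illustration only)."""
    out = np.full(ngroups, INAT, dtype=np.int64)
    result_mask = np.zeros(ngroups, dtype=bool)
    nobs = np.zeros(ngroups, dtype=np.int64)
    for val, is_na, lab in zip(values, mask, labels):
        if is_na:
            # Missingness comes from the mask, never from the value itself.
            continue
        nobs[lab] += 1
        if val > out[lab]:
            out[lab] = val
    # Missing results are flagged in the mask; no sentinel is written to out.
    result_mask[nobs < max(min_count, 1)] = True
    return out, result_mask


values = np.array([1, INAT], dtype=np.int64)
labels = np.array([0, 0])

# Both entries are real integers: min_count=2 is reached and the max is 1.
out_a, mask_a = group_max_masked(
    values, np.array([False, False]), labels, ngroups=1, min_count=2
)

# Second entry is genuinely missing: the result is flagged via the mask.
out_b, mask_b = group_max_masked(
    values, np.array([False, True]), labels, ngroups=1, min_count=2
)
```

In this sketch the caller decides how to render a masked result (for example as NA in a nullable dtype), which is why no iNaT round trip is needed.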