DEPR: Change default to observed=True in DataFrame.groupby · Issue #43999 · pandas-dev/pandas

The default behaviour of pandas.DataFrame.groupby currently depends on the type of the groupers: when one of the groupers is categorical, unobserved categories are added to the groupby result by default. This behaviour can be overridden by setting the observed argument to True.
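A minimal sketch of the difference, using made-up data (the column names `key` and `value` and the extra category `"c"` are assumptions for illustration). `observed` is passed explicitly in both categorical calls so the snippet does not depend on which default the installed pandas version uses:

```python
import pandas as pd

df = pd.DataFrame({
    "key": ["a", "a", "b"],
    "value": [1, 2, 3],
})

# Plain (non-categorical) grouper: only keys present in the data appear.
plain = df.groupby("key")["value"].sum()

# Same data, but the grouper is categorical with an unobserved category "c".
df_cat = df.assign(key=pd.Categorical(df["key"], categories=["a", "b", "c"]))

# observed=False: the unobserved category "c" is added to the result.
with_unobserved = df_cat.groupby("key", observed=False)["value"].sum()

# observed=True: only categories that actually occur are kept,
# matching the result for the non-categorical grouper.
only_observed = df_cat.groupby("key", observed=True)["value"].sum()

print(list(plain.index))            # ['a', 'b']
print(list(with_unobserved.index))  # ['a', 'b', 'c']
print(list(only_observed.index))    # ['a', 'b']
```

With `observed=False` the group for `"c"` is empty, so its sum is 0.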

I feel that making the groupby API behave consistently by default, regardless of the underlying data type, would provide a much better user experience.

Describe the solution you'd like

Default to observed=True in pandas.DataFrame.groupby.

API breaking implications

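Would break backwards compatibility: code that relies on unobserved categories appearing in the result would have to pass observed=False explicitly.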

Describe alternatives you've considered

So far the only option I can think of is to add observed=True to every groupby I write to make sure it will behave correctly no matter what kind of data gets passed to it.
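One way to apply this workaround in a single place is a small wrapper. This is only a sketch: `groupby_observed` is a hypothetical helper, not part of pandas, and it simply forwards to `DataFrame.groupby` with `observed=True` unless the caller says otherwise.

```python
import pandas as pd


def groupby_observed(df: pd.DataFrame, by, **kwargs):
    """Hypothetical helper: group with observed=True unless overridden."""
    kwargs.setdefault("observed", True)
    return df.groupby(by, **kwargs)


# Made-up data: "c" is a declared category that never occurs.
frame = pd.DataFrame({
    "key": pd.Categorical(["a", "b", "a"], categories=["a", "b", "c"]),
    "value": [10, 20, 30],
})

result = groupby_observed(frame, "key")["value"].sum()
print(list(result.index))  # ['a', 'b']
```

Callers that do want the unobserved categories can still pass `observed=False` through the wrapper.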