BUG: algorithms.factorize moves null values when sort=False by rhshadrach · Pull Request #46601 · pandas-dev/pandas
In the example below, result_index has nan moved to the end even though sort=False. Every groupby reduction result uses this order, which is why transform currently returns wrong results.
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 3, np.nan, 1, 2], 'b': [3, 4, 5, 6, 7]})
print(df.groupby('a', sort=False, dropna=False).grouper.result_index)
# main
Float64Index([1.0, 3.0, 2.0, nan], dtype='float64', name='a')
# this PR
Float64Index([1.0, 3.0, nan, 2.0], dtype='float64', name='a')
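
To make the expected ordering concrete, here is a minimal pure-Python sketch of factorizing in order of first appearance while keeping nan as its own group. It is only an illustration of the behavior this PR aims for, not the pandas implementation; the helper name factorize_keep_order is hypothetical.

```python
import math


def factorize_keep_order(values):
    """Assign integer codes in order of first appearance.

    Hypothetical helper for illustration only: NaN is treated as a single
    group and keeps the position where it first occurs.
    """
    codes = []
    uniques = []
    positions = {}   # maps a non-NaN value to its code
    nan_code = None  # NaN != NaN, so it is tracked outside the dict
    for v in values:
        if isinstance(v, float) and math.isnan(v):
            if nan_code is None:
                nan_code = len(uniques)
                uniques.append(v)
            codes.append(nan_code)
        else:
            if v not in positions:
                positions[v] = len(uniques)
                uniques.append(v)
            codes.append(positions[v])
    return codes, uniques


codes, uniques = factorize_keep_order([1.0, 3.0, float("nan"), 1.0, 2.0])
print(codes)    # [0, 1, 2, 0, 3]
print(uniques)  # [1.0, 3.0, nan, 2.0]
```

The uniques here follow the same order as the "this PR" output above: nan stays in third position instead of being moved after 2.0.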