TST: tests for GH4862, GH7401, GH7403, GH7405 by behzadnouri · Pull Request #9292 · pandas-dev/pandas

@behzadnouri, yes, precisely: here the conversion of ints to floats is done by Index.__new__. However, since this is being compared to the result of left = df.set_index(['A', 'B']).unstack(0), the unstack() code is clearly doing the same conversion.

First, other than specifying dtype='object', there doesn't seem to be a way to construct a single-level index of ints with NaNs. In a MultiIndex with more than one level, a level can be int with NaNs, but a MultiIndex with a single level is automatically converted to an Index, and for an int level with NaNs that means a Float64Index.

In [27]: pd.Index([nan, 1, 2])  # note automatic conversion to Float64Index
Out[27]: Float64Index([nan, 1.0, 2.0], dtype='float64')

In [29]: pd.MultiIndex.from_tuples([(nan,), (1,), (2,)])  # note automatic conversion to Float64Index
Out[29]: Float64Index([nan, 1.0, 2.0], dtype='float64')

In [30]: pd.MultiIndex.from_tuples([(nan, 'a'), (1, 'a'), (2, 'a')])
Out[30]:
MultiIndex(levels=[[1, 2], ['a']],
           labels=[[-1, 0, 1], [0, 0, 0]])

In [31]: pd.MultiIndex.from_tuples([(nan, 'a'), (1, 'a'), (2, 'a')]).levels[0]
Out[31]: Int64Index([1, 2], dtype='int64')  # note first level is Int64Index

Second, my point is that in the course of stacking and/or unstacking (and perhaps other operations), a single level of a multi-level MultiIndex can be "promoted" to an Index in its own right. When that happens (at least currently with stack() and unstack()), its values are changed from ints to floats, which I don't think is a good thing to do to index values.
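
For anyone who wants to reproduce the dtype change without going through stack() or unstack(), here is a minimal sketch. It is only an illustration and rests on some assumptions: a pandas version whose MultiIndex constructor accepts the codes keyword (called labels in the output above), and get_level_values() as a stand-in for the "promotion" of one level to a standalone Index. It checks dtypes rather than class names, since the Float64Index/Int64Index names shown above depend on the pandas version.

```python
import numpy as np
import pandas as pd

# A flat index mixing NaN with integers is stored as float64.
flat = pd.Index([np.nan, 1, 2])

# Two-level MultiIndex built from explicit levels and codes.
# A code of -1 marks a missing entry, so the first level itself
# only needs to hold the integers 1 and 2.
mi = pd.MultiIndex(
    levels=[[1, 2], ["a"]],
    codes=[[-1, 0, 1], [0, 0, 0]],
)

# The level keeps its integer dtype inside the MultiIndex ...
level_dtype = mi.levels[0].dtype

# ... but materializing the level as a standalone Index has to
# represent the missing entry, which forces a cast to float.
promoted = mi.get_level_values(0)

print(flat.dtype)
print(level_dtype)
print(promoted.dtype)
```

The last two dtypes differ, which is the same int-to-float change described above, just triggered by a different operation.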