TST: tests for GH4862, GH7401, GH7403, GH7405 by behzadnouri · Pull Request #9292 · pandas-dev/pandas
@behzadnouri, yes, precisely: here the conversion of ints to floats is done by Index.__new__. However, since this is being compared to the result of left = df.set_index(['A', 'B']).unstack(0), the unstack() code is clearly doing the same conversion.
First, other than specifying dtype='object', there doesn't seem to be a way to construct a (single-level) index of ints with NaNs. In a MultiIndex with more than one level, a level can be int with NaNs, but a MultiIndex with a single level gets converted automatically to an Index, and in the case of an int level with NaNs, to a Float64Index.
In [27]: pd.Index([nan, 1, 2]) # note automatic conversion to Float64Index
Out[27]: Float64Index([nan, 1.0, 2.0], dtype='float64')
In [29]: pd.MultiIndex.from_tuples([(nan,), (1,), (2,)]) # note automatic conversion to Float64Index
Out[29]: Float64Index([nan, 1.0, 2.0], dtype='float64')
In [30]: pd.MultiIndex.from_tuples([(nan, 'a'), (1, 'a'), (2, 'a')])
Out[30]:
MultiIndex(levels=[[1, 2], ['a']],
           labels=[[-1, 0, 1], [0, 0, 0]])
In [31]: pd.MultiIndex.from_tuples([(nan, 'a'), (1, 'a'), (2, 'a')]).levels[0]
Out[31]: Int64Index([1, 2], dtype='int64') # note first level is Int64Index
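For anyone who wants to re-check the session above as a script, here is a minimal sketch. The variable names are mine, and the exact reprs (Float64Index, Int64Index) depend on the pandas version, so it only inspects dtypes. The dtype of the MultiIndex level is printed rather than asserted, since I have only verified it in the session above.

```python
import numpy as np
import pandas as pd

# A single-level index of ints plus NaN: the ints are upcast to float64.
flat = pd.Index([np.nan, 1, 2])
print(flat.dtype)

# Passing dtype=object is the one way to keep the ints as ints next to NaN.
as_object = pd.Index([np.nan, 1, 2], dtype=object)
print(as_object.dtype)

# In a two-level MultiIndex the NaN is stored as code -1, so the level
# itself does not need to hold NaN; inspect what dtype the level ends up with.
mi = pd.MultiIndex.from_tuples([(np.nan, 'a'), (1, 'a'), (2, 'a')])
print(mi.levels[0].dtype)
```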
Second, my point is that in the course of stacking and/or unstacking (and perhaps other operations), a single level in a multi-level MultiIndex can be "promoted" to be an Index unto itself, and in that case it seems (at least currently in the case of stack() and unstack()) to be changed from ints to floats, which I don't think is a good thing to do to index values.
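A rough way to observe the promotion described here, assuming a small made-up Series (the data and the level names A and B are illustrative, not taken from the tests in this PR). It records the dtype of level A while it is still part of the MultiIndex, then the dtype of the row index once unstack() has left A as the only remaining level. No expected output is shown because the result may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical data: level A holds ints plus one NaN, level B holds strings.
mi = pd.MultiIndex.from_tuples(
    [(np.nan, 'a'), (1, 'a'), (2, 'b')], names=['A', 'B']
)
s = pd.Series([10, 20, 30], index=mi)

# dtype of level A while it lives inside the MultiIndex
before = s.index.levels[0].dtype

# Moving B into the columns leaves A as a plain, single-level row index.
wide = s.unstack('B')
after = wide.index.dtype

print("level A inside the MultiIndex:", before)
print("row index after unstack:      ", after)
```

If the two dtypes differ, that is the int-to-float change I am objecting to.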