BUG: DataFrame.unstack() with two levels containing NaN · Issue #9497 · pandas-dev/pandas
With the latest master code (including #9061 and #9292), DataFrame.unstack() produces incorrect results when the level argument lists two levels, one of which contains NaN:
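import numpy as np
import pandas as pd
from pandas import MultiIndex, Series
from pandas.util.testing import assert_series_equal
df = pd.DataFrame({'A': list('aaaabbbb'), 'B': range(8), 'C': range(8)})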
df.iloc[3, 1] = np.NaN
dfs = df.set_index(['A', 'B'])
left = dfs.unstack([0, 1])
data = [3, 0, 1, 2, 4, 5, 6, 7]
idx = MultiIndex(levels=[['C'], ['a', 'b'], [0, 1, 2, 4, 5, 6, 7]],
                 labels=[[0, 0, 0, 0, 0, 0, 0, 0],
                         [0, 0, 0, 0, 1, 1, 1, 1],
                         [-1, 0, 1, 2, 3, 4, 5, 6]],
                 names=[None, 'A', 'B'])
right = Series(data, index=idx, dtype=dfs.dtypes[0])
print("dfs=")
print(dfs)
print("\nleft=")
print(left)
print("\nright=")
print(right)
assert_series_equal(left, right)
======================================================================
FAIL: test_unstack_nan_index (pandas.tests.test_frame.TestDataFrame)
----------------------------------------------------------------------
Traceback (most recent call last):
File "c:\Users\seth\github\pandas2\build\lib.win-amd64-3.4\pandas\tests\test_frame.py", line 12439, in test_unstack_nan_index
assert_series_equal(left, right)
File "c:\Users\seth\github\pandas2\build\lib.win-amd64-3.4\pandas\util\testing.py", line 674, in assert_series_equal
assert_almost_equal(left.values, right.values, check_less_precise)
File "das\src\testing.pyx", line 58, in pandas._testing.assert_almost_equal (pandas\src\testing.c:2745)
File "das\src\testing.pyx", line 93, in pandas._testing.assert_almost_equal (pandas\src\testing.c:1830)
File "das\src\testing.pyx", line 135, in pandas._testing.assert_almost_equal (pandas\src\testing.c:2514)
nose.proxy.AssertionError: (very low values) expected 3.00000 but got 0.00000, with decimal 5
-------------------- >> begin captured stdout << ---------------------
dfs=
       C
A B
a 0    0
  1    1
  2    2
  NaN  3
b 4    4
  5    5
  6    6
  7    7
left=
   A  B
C  a  1    0
      2    1
      4    2
      0    3
   b  6    4
      7    5
   a  0    6
      1    7
dtype: int32
right=
   A  B
C  a  NaN    3
      0      0
      1      1
      2      2
   b  4      4
      5      5
      6      6
      7      7
dtype: int32
--------------------- >> end captured stdout << ----------------------
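For anyone reading the expected index by hand: each entry in labels is a position into the matching levels list, and -1 marks a missing value (NaN). The following is a rough pure-Python sketch of that decoding, not pandas code. The helper name decode_level is made up for illustration; levels, labels and data are copied from the snippet above.

```python
import math

def decode_level(level_values, labels):
    """Map integer labels to level values; -1 is treated as the NaN sentinel.

    Hypothetical helper for illustration only, not part of pandas.
    """
    return [math.nan if code == -1 else level_values[code] for code in labels]

# Values copied from the expected MultiIndex in the report.
levels = [['C'], ['a', 'b'], [0, 1, 2, 4, 5, 6, 7]]
labels = [[0, 0, 0, 0, 0, 0, 0, 0],
          [0, 0, 0, 0, 1, 1, 1, 1],
          [-1, 0, 1, 2, 3, 4, 5, 6]]
data = [3, 0, 1, 2, 4, 5, 6, 7]

# Decode each level, then zip the levels together into one key per row.
keys = list(zip(*(decode_level(lv, lb) for lv, lb in zip(levels, labels))))
expected = list(zip(keys, data))

for key, value in expected:
    print(key, value)
```

The first decoded key is ('C', 'a', nan) paired with 3, which is the first row of right. The left output above has no NaN entry in B at all.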