get_dummies with NaN · Issue #4446 · pandas-dev/pandas
get_dummies seems to get caught out by NaNs
In [11]: s1 = pd.Series(['a', 'a', np.nan, 'c', 'c', 'c'])
In [12]: s1
Out[12]:
0      a
1      a
2    NaN
3      c
4      c
5      c
dtype: object
In [13]: pd.get_dummies(s1)
Out[13]:
   a  c
0  1  0
1  1  0
2  0  1
3  0  1
4  0  1
5  0  1
The NaN row (index 2) has been wrongly encoded as c. I think the expected output is:
In [14]: pd.get_dummies(s1[s1.notnull()])
Out[14]:
   a  c
0  1  0
1  1  0
3  0  1
4  0  1
5  0  1
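For anyone wanting to sidestep this, here is a minimal sketch of two possible workarounds. It assumes a pandas version in which get_dummies accepts the dummy_na and dtype keyword arguments; the variable names dropped and with_na are only illustrative. The first variant drops the missing values before encoding, which matches the expected output above. The second keeps every row and gives NaN its own indicator column.

```python
import numpy as np
import pandas as pd

s1 = pd.Series(['a', 'a', np.nan, 'c', 'c', 'c'])

# Workaround 1: drop missing values before encoding
# (equivalent to s1[s1.notnull()] in the report above).
dropped = pd.get_dummies(s1.dropna(), dtype=int)

# Workaround 2 (assumes dummy_na is available in your pandas version):
# keep every row and add a separate indicator column for NaN.
with_na = pd.get_dummies(s1, dummy_na=True, dtype=int)

# Sanity check: each row is flagged for exactly one category,
# so the NaN row can no longer be silently counted as 'c'.
assert (dropped.sum(axis=1) == 1).all()
assert (with_na.sum(axis=1) == 1).all()
```

Which variant is appropriate depends on whether the missing rows need to stay aligned with the original index.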