subset keyword argument will include last column if incorrect keys are given · Issue #8303 · pandas-dev/pandas
If you pass a non-existent key to the subset argument of dropna, it is silently resolved to the last column instead of raising an error. For example:
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: d = pd.DataFrame({'a': [1], 'b': [2], 'c': [np.nan]})
In [4]: len(d.dropna(subset=['x']))
Out[4]: 0
It seems the problem is in the line where dropna takes the subset:
agg_obj = self.take(ax.get_indexer_for(subset),axis=agg_axis)
Here, ax.get_indexer_for(subset) returns -1 as a sentinel value for any key in subset that is not found in ax, and self.take interprets the -1 as a request for the last column. In the example above the last column is 'c', which holds NaN, so the only row is dropped.
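The mechanism can be reproduced outside of dropna with the two public calls involved. The snippet below is only a sketch: checked_indexer is a hypothetical helper name, not part of pandas, and it assumes the column labels are unique. It shows one possible way to reject missing keys before they reach take, not necessarily how pandas itself should fix this.

```python
import numpy as np
import pandas as pd

d = pd.DataFrame({'a': [1], 'b': [2], 'c': [np.nan]})
subset = ['x']

# get_indexer_for marks every label it cannot find with the sentinel -1.
indexer = d.columns.get_indexer_for(subset)

# take treats -1 as a position counted from the end, so the last column is selected.
picked = d.take(indexer, axis=1)


def checked_indexer(columns, subset):
    """Hypothetical guard (not a pandas API): raise on labels missing from columns.

    Assumes unique column labels, so each key maps to exactly one position.
    """
    indexer = columns.get_indexer_for(subset)
    missing = [key for key, pos in zip(subset, indexer) if pos == -1]
    if missing:
        raise KeyError(missing)
    return indexer
```

With a guard like this, checked_indexer(d.columns, ['x']) raises KeyError instead of quietly handing -1 to take, while valid keys still resolve to their positions.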