subset keyword argument will include last column if incorrect keys are given · Issue #8303 · pandas-dev/pandas

If you pass a non-existent key to the subset argument of dropna, it silently falls back to the last column instead of raising an error. For example:

In [3]: d = pd.DataFrame({'a':[1],'b':[2],'c':[np.nan]})
In [4]: len(d.dropna(subset=['x']))
Out[4]: 0

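The only row is dropped even though 'x' is not a column, because the check is applied to the last column, 'c', which is NaN. It seems the problem is in the line where dropna takes the subset: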

agg_obj = self.take(ax.get_indexer_for(subset),axis=agg_axis)

Here, ax.get_indexer_for(subset) returns -1 as a sentinel value for any key in subset that is not found in ax, and self.take interprets the -1 as a position counted from the end, i.e. a request for the last column.
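The mechanism can be reproduced outside dropna. The sketch below is illustrative, not the actual pandas fix: check_subset is a hypothetical helper name, and it shows only one possible guard, which is to reject unknown labels before the indexer reaches take.

```python
import numpy as np
import pandas as pd

d = pd.DataFrame({'a': [1], 'b': [2], 'c': [np.nan]})

# Labels that are not in the columns are reported as -1 rather than raising.
indexer = d.columns.get_indexer_for(['x'])
print(indexer)  # [-1]

# take() reads -1 as "first position from the end", so the last column comes back.
picked = d.take(indexer, axis=1)
print(list(picked.columns))  # ['c']


def check_subset(frame, subset):
    """Hypothetical guard: raise KeyError if any label in subset is not a column."""
    missing = [key for key in subset if key not in frame.columns]
    if missing:
        raise KeyError(missing)
    return frame.columns.get_indexer_for(subset)
```

With a guard like this, check_subset(d, ['x']) raises KeyError instead of quietly selecting 'c', while valid labels still resolve to their positions.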