select_dtypes and get_dummies break on duplicate columns · Issue #20848 · pandas-dev/pandas
The functions `select_dtypes` and `get_dummies` behave strangely and incorrectly on DataFrames with duplicate column names, as shown below:
In [6]: df
Out[6]:
   col1 col1
0     1    a
1     2    b

In [7]: df.select_dtypes(include=['int'])
Out[7]:
Empty DataFrame
Columns: []
Index: [0, 1]

In [8]: pd.get_dummies(df)
Out[8]:
   col1_('c', 'o', 'l', '1')  col1_('c', 'o', 'l', '1')
0                          1                          1
1                          1                          1
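
For anyone hitting this, here is a minimal sketch of how the frame above might be rebuilt and how the duplicate names could be detected and worked around before calling either function. The construction of `df` is inferred from the printed output, and `dedupe_columns` is a hypothetical helper written for this example, not part of pandas. Whether the buggy output itself reproduces depends on the pandas version, so the sketch only shows the workaround.

```python
import pandas as pd

# Rebuild a frame like the one in the report (assumed construction):
# two columns that share the name "col1", one integer and one string.
df = pd.DataFrame([[1, "a"], [2, "b"]], columns=["col1", "col1"])

# Detect the duplicate labels up front.
has_duplicates = not df.columns.is_unique
duplicate_mask = df.columns.duplicated()  # True for each repeated label


def dedupe_columns(columns):
    """Hypothetical helper: suffix repeated names with _1, _2, ...

    Naive on purpose: it does not check whether a generated name
    collides with a label that already exists in the frame.
    """
    seen = {}
    result = []
    for name in columns:
        count = seen.get(name, 0)
        result.append(name if count == 0 else f"{name}_{count}")
        seen[name] = count + 1
    return result


# Work on a copy with unique labels, then call the two functions.
renamed = df.copy()
renamed.columns = dedupe_columns(df.columns)

numeric = renamed.select_dtypes(include=["number"])
dummies = pd.get_dummies(renamed)

print(has_duplicates, list(duplicate_mask))
print(list(renamed.columns))
print(list(numeric.columns))
print(list(dummies.columns))
```

Renaming on a copy leaves the original frame untouched, so the duplicate labels are still available if they carry meaning elsewhere. The sketch uses `include=["number"]` rather than `include=['int']` only to avoid platform-dependent integer widths; that choice is an assumption of this example, not something the report specifies.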