pd.to_numeric produces misleading results on DataFrame · Issue #11776 · pandas-dev/pandas
When `pd.to_numeric` is called with `errors='coerce'` on a `DataFrame`, it doesn't raise and just returns the original `DataFrame`.
This may be related to the discussion in #11221, as this function currently doesn't support anything more than 1-d.
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'a': [1, 2, 'foo'], 'b': [2.3, -1, 'bar']})
In [3]: df
Out[3]:
     a    b
0    1  2.3
1    2   -1
2  foo  bar
In [4]: pd.to_numeric(df)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-4-9febd95a7c0a> in <module>()
----> 1 pd.to_numeric(df)
/Users/mortada_mehyar/code/github/pandas/pandas/tools/util.py in to_numeric(arg, errors)
94 conv = lib.maybe_convert_numeric(arg,
95 set(),
---> 96 coerce_numeric=coerce_numeric)
97 except:
98 if errors == 'raise':
/Users/mortada_mehyar/code/github/pandas/pandas/src/inference.pyx in pandas.lib.maybe_convert_numeric (pandas/lib.c:52369)()
518 cdef int64_t iINT64_MIN = <int64_t> INT64_MIN
519
--> 520 def maybe_convert_numeric(object[:] values, set na_values,
521 bint convert_empty=True, bint coerce_numeric=False):
522 '''
ValueError: Buffer has wrong number of dimensions (expected 1, got 2)
In [5]: pd.to_numeric(df, errors='coerce')
Out[5]:
     a    b
0    1  2.3
1    2   -1
2  foo  bar
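Note that the last expression doesn't raise but the previous one does. Judging by the traceback, the bare `except:` around `lib.maybe_convert_numeric` only re-raises when `errors == 'raise'`, so with `errors='coerce'` the dimension error appears to be swallowed and the input is returned unchanged.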
Seems like we should either
- make `pd.to_numeric` work with `DataFrame` or `NDFrame` in general
- simply raise here too if a `DataFrame` or something more than 1-d is passed
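
For illustration, the two options above could be sketched in user code roughly as follows. This is only a sketch, not a proposed patch to pandas: `to_numeric_frame` and `to_numeric_strict` are hypothetical helper names, and the first one assumes that converting a `DataFrame` column by column with `DataFrame.apply` is the desired semantics.

```python
import pandas as pd


def to_numeric_frame(obj, errors="raise"):
    """Hypothetical helper for option 1: accept a DataFrame by converting each column."""
    if isinstance(obj, pd.DataFrame):
        # DataFrame.apply forwards extra keyword arguments to the function,
        # so every column (a 1-d Series) goes through pd.to_numeric separately.
        return obj.apply(pd.to_numeric, errors=errors)
    return pd.to_numeric(obj, errors=errors)


def to_numeric_strict(obj, errors="raise"):
    """Hypothetical helper for option 2: reject anything beyond 1-d, whatever `errors` is."""
    # Objects without an `ndim` attribute (lists, tuples, scalars) are treated as 1-d here.
    if getattr(obj, "ndim", 1) > 1:
        raise TypeError("to_numeric only supports 1-d input")
    return pd.to_numeric(obj, errors=errors)


df = pd.DataFrame({'a': [1, 2, 'foo'], 'b': [2.3, -1, 'bar']})

# Option 1: the non-numeric entries in each column become NaN.
coerced = to_numeric_frame(df, errors='coerce')

# Option 2: the DataFrame is rejected even with errors='coerce'.
try:
    to_numeric_strict(df, errors='coerce')
except TypeError as exc:
    strict_error = str(exc)
```

With the first helper, `'foo'` and `'bar'` are coerced to `NaN` instead of the frame being handed back untouched; with the second, the caller gets an explicit error rather than a silent no-op.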