pd.Series.interpolate(method="quadratic") error with non-numeric index column · Issue #21662 · pandas-dev/pandas

I'm working with the Series.interpolate function and I noticed that a DataFrame's index column can cause some weird problems when using the quadratic method.

First example: interpolating data with a non-numeric index column raises an exception:

import numpy as np
import pandas as pd

data = {'A': ["a", "b", "c", "d"], 'B': [0, 1, np.nan, 100]}
df = pd.DataFrame(data=data)
df.set_index("A", inplace=True)
s = pd.Series(df["B"])

s.interpolate(method="quadratic")

Raises the following error:

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
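The same message can be produced with NumPy alone by calling `np.isnan` on an array of string labels, which is at least consistent with the index labels reaching numeric code. Whether this is the exact call path inside SciPy is an assumption; the snippet only shows where a message of this form comes from.

```python
import numpy as np

# String labels like the ones used as the index in the example.
labels = np.array(["a", "b", "c", "d"], dtype=object)

try:
    np.isnan(labels)  # isnan is only defined for numeric dtypes
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```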

I know Pandas uses Scipy's quadratic interpolate method, and while this error is raised inside Scipy, I believe it is because Pandas expects a numeric index column to interpolate data when using a quadratic method and sends it to Scipy's method.

The first example runs without errors when a method other than quadratic is used.

The following code also runs without any errors:

data = {'A': [1, 2, 3, 4], 'B': [0, 1, np.nan, 100]}
df = pd.DataFrame(data=data)
df.set_index("A", inplace=True)
s = pd.Series(df["B"])

s.interpolate(method="quadratic")

outputs:

A
0      0.0
1      1.0
2     18.0
4    100.0
Name: B, dtype: float64

So while it makes sense to use the index column as an indicator of how many timesteps separate two values in a series, other methods seem to simply assume a spacing of 1 between consecutive rows and avoid using the index column.
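A small sketch of that difference, using the index labels shown in the output (0, 1, 2, 4). It avoids SciPy: the quadratic value is computed with `np.polyfit` through the three known points, which is a stand-in for, not a copy of, what `method="quadratic"` does internally. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same values as the example, indexed by the labels shown in the output.
s = pd.Series([0, 1, np.nan, 100], index=[0, 1, 2, 4], name="B")

# Default linear interpolation treats rows as equally spaced and ignores labels.
linear_value = s.interpolate(method="linear").loc[2]

# method="index" is also linear, but uses the numeric labels as x-coordinates.
index_value = s.interpolate(method="index").loc[2]

# Stand-in for the quadratic case: the unique parabola through the known points,
# evaluated at the label of the missing row.
known = s.dropna()
coeffs = np.polyfit(known.index.to_numpy(dtype=float), known.to_numpy(), deg=2)
quadratic_value = np.polyval(coeffs, 2.0)

print(linear_value, index_value, round(quadratic_value, 6))
```

Here the position-based linear fill gives 50.5, the label-based linear fill gives 34.0, and the parabola gives 18.0, the value reported above. The label-aware results depend on the index, so they need numeric labels to work with.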

Two things I can think of that could be helpful are raising a more descriptive error message when the method receives a non-numeric index column, or documenting this requirement, as I couldn't find anything about this error and solved it by tinkering with the index column.
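Both the clearer error and the index workaround can be sketched outside pandas. The helper names below (`require_numeric_index`, `interpolate_by_position`) are hypothetical and not part of the pandas API. The numeric-only check is a simplification: it would also reject datetime indexes, which pandas may be able to handle on its own.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype


def require_numeric_index(s, method):
    """Hypothetical pre-check: fail early with a readable message."""
    if not is_numeric_dtype(s.index):
        raise TypeError(
            f"interpolate(method={method!r}) uses the index as x-coordinates, "
            f"but the index has non-numeric dtype {s.index.dtype}."
        )


def interpolate_by_position(s, method="quadratic", **kwargs):
    """Hypothetical workaround: interpolate over row positions 0..n-1,
    then put the original labels back."""
    positional = s.reset_index(drop=True)
    filled = positional.interpolate(method=method, **kwargs)
    filled.index = s.index
    return filled


# The string-indexed series from the first example.
s = pd.Series(
    [0, 1, np.nan, 100],
    index=pd.Index(["a", "b", "c", "d"], name="A"),
    name="B",
)

try:
    require_numeric_index(s, "quadratic")
    check_message = None
except TypeError as exc:
    check_message = str(exc)

print(check_message)
```

With SciPy installed, `interpolate_by_position(s)` should run on the string-indexed series, at the cost of treating every row as exactly one step apart.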

Let me know if either of these ideas seems appropriate for a PR.

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.20.3
pytest: 3.2.1
pip: 10.0.1
setuptools: 36.5.0.post20170921
Cython: 0.26.1
numpy: 1.13.3
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.1.0
openpyxl: 2.4.8
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.0
bs4: 4.6.0
html5lib: 0.9999999
sqlalchemy: 1.1.13
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None