BUG: Series.fillna() crashes on Categorical series if value is a series · Issue #17033 · pandas-dev/pandas
Code Sample
import numpy as np
import pandas as pd

s_str = pd.Series(['hello', np.nan])
print(s_str.fillna(s_str))  # This works

s_cat = s_str.astype('category')
print(s_cat.fillna(s_str))  # This crashes
Problem description
Series.fillna() accepts a scalar, dict, Series or DataFrame as value. The fillna() method for categoricals accepts only scalars, but it does not raise a clear error when given an unsupported input type such as a Series.
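For reference, a minimal sketch of the value types that do work on a plain (non-categorical) Series. The data here is made up for illustration; a dict is keyed by index label, and a Series is aligned on the index.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, np.nan])

# Scalar: every missing entry gets the same value.
by_scalar = s.fillna(0.0)

# Dict: keys are index labels; labels not listed stay missing.
by_dict = s.fillna({1: 10.0})

# Series: aligned on the index, only missing positions are filled.
by_series = s.fillna(pd.Series([7.0, 8.0, 9.0]))

print(by_scalar.tolist())
print(by_dict.tolist())
print(by_series.tolist())
```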
Calling Categorical.fillna() with a Series as value raises a cryptic error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
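Until this is addressed, one possible workaround is to round-trip through object dtype, fill there, and convert back. This is only a sketch: the `filler` series is hypothetical, and converting back with `astype('category')` infers the categories again from the filled data, so it does not preserve the original category list or ordering.

```python
import numpy as np
import pandas as pd

s_str = pd.Series(['hello', np.nan])
s_cat = s_str.astype('category')

# Hypothetical replacement values, aligned on the same index.
filler = pd.Series(['hello', 'world'])

# Fill on the object-dtype view, then convert back to category.
filled = s_cat.astype(object).fillna(filler).astype('category')

print(filled.tolist())
```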
Expected Output
If Series cannot be supported as input, the function should check for input type and provide a proper ValueError message (e.g. "value must be a scalar").
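A minimal sketch of such a check, written as a user-level wrapper rather than as a patch to pandas internals. The function name `fillna_scalar_only` is hypothetical; `pandas.api.types.is_scalar` is the public helper used for the test.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_scalar


def fillna_scalar_only(series, value):
    """Hypothetical wrapper: reject non-scalar fill values up front."""
    if not is_scalar(value):
        raise ValueError(
            "value must be a scalar, got {}".format(type(value).__name__)
        )
    return series.fillna(value)


s_cat = pd.Series(['hello', np.nan]).astype('category')

# A scalar that is already a category is accepted.
ok = fillna_scalar_only(s_cat, 'hello')

# A Series is rejected with a readable message.
try:
    fillna_scalar_only(s_cat, pd.Series(['hello', 'hello']))
    message = None
except ValueError as exc:
    message = str(exc)

print(ok.tolist())
print(message)
```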
The ideal solution, however, would be to have Categorical.fillna() support the same value types as other fillna() methods.
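One way Series support could behave is sketched below, under stated assumptions: the fill Series is aligned on the index, only missing positions are touched, and fill values outside the existing categories are rejected. The helper name `fillna_categorical_with_series` is hypothetical and this is not how pandas implements it.

```python
import numpy as np
import pandas as pd


def fillna_categorical_with_series(s_cat, filler):
    """Hypothetical helper: fill a categorical Series from another Series."""
    aligned = filler.reindex(s_cat.index)
    # Positions that are missing and have a usable replacement.
    to_fill = s_cat.isna() & aligned.notna()

    unknown = set(aligned[to_fill]) - set(s_cat.cat.categories)
    if unknown:
        raise ValueError(
            "fill values not in categories: {}".format(sorted(map(str, unknown)))
        )

    # Merge on object dtype, then rebuild with the original categories.
    merged = s_cat.astype(object).where(~to_fill, aligned)
    values = pd.Categorical(
        merged,
        categories=s_cat.cat.categories,
        ordered=s_cat.cat.ordered,
    )
    return pd.Series(values, index=s_cat.index, name=s_cat.name)


s_cat = pd.Series(['a', np.nan, 'b']).astype('category')
result = fillna_categorical_with_series(s_cat, pd.Series(['b', 'a', 'a']))
print(result.tolist())
```

Rejecting unknown values mirrors the existing restriction that a scalar fill value must already be a category; adding new categories automatically would be a separate design decision.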
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
pandas: 0.20.1
pytest: None
pip: 9.0.1
setuptools: 30.2.0
Cython: 0.23.4
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 5.1.0
sphinx: 1.5.3
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.0.0
tables: None
numexpr: 2.5
feather: None
matplotlib: 1.5.1
openpyxl: 2.4.1
xlrd: 0.9.4
xlwt: None
xlsxwriter: None
lxml: 3.4.4
bs4: None
html5lib: None
sqlalchemy: 1.1.2
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
jinja2: 2.9.5
s3fs: None
pandas_gbq: None
pandas_datareader: None