ERR: Segfault with df.astype(category) on empty dataframe · Issue #18004 · pandas-dev/pandas
Pandas segfaults when calling DataFrame.astype('category') on an empty dataframe. This fails in 0.21.0rc1, 0.20.3, and probably previous versions as well.
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'x': ['a', 'b', 'c'], 'y': ['a', 'b', 'c'], 'z': ['a', 'b', 'c']})
In [3]: empty = df.iloc[:0].copy() # copy is necessary, no segfault without it
In [4]: empty.astype('category')
Segmentation fault: 11
For non-empty frames, an error message is raised saying this operation isn't supported yet. Also note that the copy is needed to trigger the segfault; without it, the error message is still raised.
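A possible workaround, sketched here and not verified against the affected versions, is to convert one column at a time with Series.astype('category') so the frame-level astype call is never made. The helper name to_category_per_column is hypothetical, and the assumption that the per-column path avoids the crash on 0.20.3 and 0.21.0rc1 is untested.

```python
import pandas as pd


def to_category_per_column(frame):
    """Return a copy of `frame` with every column converted to categorical dtype.

    Hypothetical helper: it converts each column through Series.astype
    instead of calling DataFrame.astype('category') on the whole frame.
    """
    out = frame.copy()
    for col in out.columns:
        # Series.astype('category') operates on a single column at a time.
        out[col] = out[col].astype('category')
    return out


df = pd.DataFrame({'x': ['a', 'b', 'c'], 'y': ['a', 'b', 'c'], 'z': ['a', 'b', 'c']})
empty = df.iloc[:0].copy()  # same empty frame as in the report

result = to_category_per_column(empty)
print(result.dtypes)
```

If this holds on the affected versions, the same helper could be applied to non-empty frames too, which would also sidestep the "not supported yet" error on the frame-level call.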
pd.show_versions:
In [5]: pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.4.final.0
python-bits: 64
OS: Darwin
OS-release: 16.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.3
pytest: 3.1.0
pip: 9.0.1
setuptools: 36.4.0
Cython: 0.25.2
numpy: 1.13.3
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: 1.5.3
patsy: None
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.5
s3fs: 0.1.1
pandas_gbq: None
pandas_datareader: None