BUG: Selecting an all-zero sparse series by indexer results in NaN



Code Sample, a copy-pastable example

import pandas as pd
import scipy as sp
import scipy.sparse

spmatrix = sp.sparse.csr_matrix((2, 2))
spmatrix[0, 0] = 1

df = pd.DataFrame.sparse.from_spmatrix(spmatrix)

# Works as expected, cool
df.loc[1]
# 0    0.0
# 1    0.0
# Name: 0, dtype: Sparse[float64, 0.0]

# Indexing with a list of labels. I expect 0.0, 0.0 for row 1; not cool
df.loc[[1]]
#      0   1
# 1  0.0 NaN

# Whenever an all-zero series is indexed this way, it returns NaN
df.loc[[1]].loc[[1]]
#     0   1
# 1 NaN NaN

# The dtype is good
df.loc[[1]].dtypes
# 0    Sparse[float64, 0.0]
# 1    Sparse[float64, 0.0]
# dtype: object

# Indexing with a boolean list is also not cool
df.loc[[True, True]]
#      0   1
# 0  1.0 NaN
# 1  0.0 NaN

# Slicing is cool
df.loc[1:2]
#      0    1
# 1  0.0  0.0

Problem description

Selecting from an all-zero sparse series with an indexer (either a boolean array or an array of index labels) returns NaN. The result should hold the series' fill_value instead, as it does when slicing.

Expected Output
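
A minimal sketch of the expected behavior, assuming the densified frame is a fair reference: selecting with a list of labels should give the same values as the same selection on the dense frame, i.e. 0.0 (the fill_value) rather than NaN. For brevity the matrix is built from a dense NumPy array instead of assigning into the csr_matrix, and the names expected and actual are illustrative only.

```python
import numpy as np
import pandas as pd
import scipy.sparse

# Same 2x2 matrix as in the code sample: a single 1 at position (0, 0).
spmatrix = scipy.sparse.csr_matrix(np.array([[1.0, 0.0], [0.0, 0.0]]))
df = pd.DataFrame.sparse.from_spmatrix(spmatrix)

# Dense reference: list-based selection on an ordinary float64 DataFrame.
expected = df.sparse.to_dense().loc[[1]]

# Selection under test, densified only so the two can be compared.
actual = df.loc[[1]].sparse.to_dense()

print(expected)
print(actual)
print("matches dense reference:", actual.equals(expected))
```

On an affected version the last line should report a mismatch, because column 1 of actual holds NaN; on a version where the selection behaves like slicing, the two frames agree.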

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit           : None
python           : 3.8.1.final.0
python-bits      : 64
OS               : Linux
OS-release       : 4.15.0-101-generic
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : en_US.UTF-8
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.0.3
numpy            : 1.18.1
pytz             : 2019.3
dateutil         : 2.8.1
pip              : 20.0.2
setuptools       : 45.2.0
Cython           : None
pytest           : 5.3.5
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 2.11.1
IPython          : 7.12.0
pandas_datareader: None
bs4              : None
bottleneck       : None
fastparquet      : None
gcsfs            : None
lxml.etree       : None
matplotlib       : 3.2.0
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pytables         : None
pytest           : 5.3.5
pyxlsb           : None
s3fs             : None
scipy            : 1.4.1
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : None
xlwt             : None
xlsxwriter       : None
numba            : None