REGR: Series repr of object Index with bools and NaN is wrong · Issue #32146 · pandas-dev/pandas

Code Sample, a copy-pastable example if possible

import pandas as pd
# a series with booleans and nan
pd.Series([False,True,True,pd.NA]).value_counts(dropna=False)
True     2
False    1
True     1
dtype: int64

# check the actual index of the value_counts result
pd.Series([False,True,True,pd.NA]).value_counts(dropna=False).index
Index([True, False, nan], dtype='object')

# a similar Series, with ints instead of booleans, works as expected
pd.Series([0,1,1,pd.NA]).value_counts(dropna=False)
1.0    2
0.0    1
NaN    1
dtype: int64

Problem description

As the example shows, the repr of the value_counts result is wrong when the Series contains booleans: the missing value should be displayed as NaN, but it is displayed as True instead. The index itself still holds nan, as the .index output above shows, so only the Series repr is affected.
Note that this is limited to Series containing booleans. A similar example with ints works fine.
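To confirm that only the display is affected, the index labels can be inspected directly instead of being read off the printed output. The following is a minimal sketch, assuming the index holds the missing value correctly (as the `.index` output above suggests); the variable names `missing_labels` and `bool_labels` are only illustrative.

```python
import pandas as pd

# Same data as in the report: three booleans and one missing value.
s = pd.Series([False, True, True, pd.NA])
counts = s.value_counts(dropna=False)

# Classify each index label by inspecting the object itself,
# not the way the Series repr prints it.
missing_labels = [label for label in counts.index if pd.isna(label)]
bool_labels = [label for label in counts.index if not pd.isna(label)]

print("missing labels:", len(missing_labels))
print("boolean labels:", sorted(bool(b) for b in bool_labels))
print("total count:", int(counts.sum()))
```

If the missing-label count is non-zero here while the printed Series shows `True` in its place, the problem lies in the formatting of the object Index, not in `value_counts` itself.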

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit : None
python : 3.7.1.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-76-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.0.1
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 45.2.0
Cython : None
pytest : 5.3.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : 2.8.4 (dt dec pq3 ext lo64)
jinja2 : None
IPython : 7.12.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.1.3
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.3.5
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.13
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None