pivot_table drops columns for aggregate functions that return all None, even if dropna=False · Issue #22159 · pandas-dev/pandas

pivot_table drops columns for aggregate functions that return all None, even if dropna=False. I'm using pandas 0.23.3; the full output of pd.show_versions() is posted at the end of this comment. #21969 seems to be related.

Code Sample

import pandas as pd

# pd.__version__ == '0.23.3'

# create sample dataframe to be pivoted
df = pd.DataFrame({'fruit': ['apple', 'orange', 'peach', 'apple', 'peach'],
                   'size': [1, 1.5, 1, 2, 1],
                   'taste': [7, 8, 6, 6, 10]})

# create three aggregate functions, one which returns only None
def return_one(x): return 1

def return_sum(x): return sum(x)

def return_none(x): return None

# reshape the dataframe
df2 = pd.pivot_table(df, columns='fruit',
                     aggfunc=[return_sum, return_none, return_one],
                     dropna=False)

# df2 does not include the output
# of return_none, which is unexpected
#
#       return_sum              return_one
# fruit      apple orange peach      apple orange peach
# size         3.0    1.5   2.0        1.0    1.0   1.0
# taste       13.0    8.0  16.0        1.0    1.0   1.0

Problem description

The dropna argument of pivot_table is True by default, meaning that the default output of pivot_table will not include "columns whose entries are all NaN".

If dropna is set to False, I would expect the resulting dataframe to include columns whose entries are all NaN (or None). This is not the case. As shown in the example, the columns are dropped even if dropna=False.
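A possible workaround, offered only as a sketch and not as a fix in pandas itself: reindex the pivoted result against the full set of expected columns, so that any aggregate group that went missing comes back as an all-NaN block. The helper name restore_agg_columns is hypothetical (it is not part of pandas), and the sketch assumes aggfunc is a list of named functions and that a single column key ('fruit') is used.

```python
import pandas as pd

df = pd.DataFrame({'fruit': ['apple', 'orange', 'peach', 'apple', 'peach'],
                   'size': [1, 1.5, 1, 2, 1],
                   'taste': [7, 8, 6, 6, 10]})

def return_one(x): return 1

def return_sum(x): return sum(x)

def return_none(x): return None

def restore_agg_columns(table, aggfuncs, column_values, level_name):
    """Hypothetical helper, not a pandas API.

    Reindex the columns of `table` to the full product of aggregate
    function names and column values. Columns already present are kept
    as they are; missing ones are added and filled with NaN.
    """
    full_columns = pd.MultiIndex.from_product(
        [[func.__name__ for func in aggfuncs], column_values],
        names=[None, level_name],
    )
    return table.reindex(columns=full_columns)

aggfuncs = [return_sum, return_none, return_one]
table = pd.pivot_table(df, columns='fruit', aggfunc=aggfuncs, dropna=False)

# Assumption: the expected column values are the distinct fruits in the input.
fixed = restore_agg_columns(table, aggfuncs, sorted(df['fruit'].unique()), 'fruit')
print(fixed)
```

On a version that drops the return_none group, the restored entries are NaN rather than None, which matches the "all NaN" wording of the dropna documentation but not the literal None values in the expected output.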

Expected Output

      return_sum             return_none              return_one             
fruit      apple orange peach      apple orange peach      apple orange peach
size         3.0    1.5   2.0       None   None  None       1.0    1.0   1.0
taste       13.0    8.0  16.0       None   None  None       1.0    1.0   1.0

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Darwin
OS-release: 17.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.23.3
pytest: 3.3.2
pip: 18.0
setuptools: 39.0.1
Cython: 0.28.4
numpy: 1.14.0
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.7
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: 1.0.5
lxml: None
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.6
pymysql: None
psycopg2: 2.7.4 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None