assert_frame_equal not differentiating NaN and None within object dtype · Issue #18463 · pandas-dev/pandas
Code Sample, a copy-pastable example if possible
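import numpy as np
import pandas as pd
import pandas.util.testing as tm  # testing helpers module in this pandas version
df1 = pd.DataFrame([['foo', 'bar', 'baz'], [None, None, None]])
df2 = pd.DataFrame([['foo', 'bar', 'baz'], [np.nan, np.nan, np.nan]])
tm.assert_frame_equal(df1, df2)  # passes without raising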
Problem description
No AssertionError gets raised in this case, even though I would not expect None and np.nan to be considered equal values. (I found this while testing #18450.)
Note that if you simply compared a DataFrame containing only np.nan with another containing only None, you would get an AssertionError saying that the dtypes are different (float64 vs object). Because the missing values here are mixed into object dtype columns, I think that differentiation gets lost.
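To make the distinction concrete, here is a minimal sketch of how the two frames differ cell by cell even though both report the same missing-value mask. The helper names none_mask and nan_mask are hypothetical, and dtype=object is passed explicitly as an assumption so the example does not depend on dtype inference. That pd.isna treats None and np.nan alike is offered only as an illustration of how the difference can disappear, not as a claim about the code path assert_frame_equal takes.

```python
import numpy as np
import pandas as pd


def none_mask(df):
    """Hypothetical helper: True where a cell is literally None."""
    return df.apply(lambda col: col.map(lambda v: v is None))


def nan_mask(df):
    """Hypothetical helper: True where a cell is a float NaN."""
    return df.apply(lambda col: col.map(lambda v: isinstance(v, float) and v != v))


# dtype=object is forced here (an assumption) so None is stored as-is.
df1 = pd.DataFrame([['foo', 'bar', 'baz'], [None, None, None]], dtype=object)
df2 = pd.DataFrame([['foo', 'bar', 'baz'], [np.nan, np.nan, np.nan]], dtype=object)

# Both frames look identical through the generic missing-value check...
same_isna = pd.isna(df1).equals(pd.isna(df2))

# ...but an identity-based check tells the two kinds of missing value apart.
same_none = none_mask(df1).equals(none_mask(df2))
```

With these definitions, same_isna and same_none disagree, which is the mismatch the report describes.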
Expected Output
AssertionError: DataFrame.iloc[1, :] are different
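Until the comparison itself distinguishes the two values, a test could guard against this with a stricter wrapper. The following is only a sketch of one possible workaround: assert_frame_equal_strict_missing is a hypothetical name, not part of pandas, and it assumes both frames share the same column labels. It compares the positions of literal None values first, then defers to the public pd.testing.assert_frame_equal.

```python
import numpy as np
import pandas as pd


def assert_frame_equal_strict_missing(left, right):
    """Hypothetical wrapper: fail when None and NaN sit in different cells.

    Assumes left and right have the same column labels.
    """
    for col in left.columns:
        left_none = left[col].map(lambda v: v is None)
        right_none = right[col].map(lambda v: v is None)
        if not left_none.equals(right_none):
            raise AssertionError(
                f"column {col!r}: None and NaN are placed differently"
            )
    # Fall through to the regular comparison for everything else.
    pd.testing.assert_frame_equal(left, right)


# dtype=object is forced (an assumption) so None is kept as None.
df1 = pd.DataFrame([['foo', 'bar', 'baz'], [None, None, None]], dtype=object)
df2 = pd.DataFrame([['foo', 'bar', 'baz'], [np.nan, np.nan, np.nan]], dtype=object)

try:
    assert_frame_equal_strict_missing(df1, df2)
    raised = False
except AssertionError:
    raised = True
```

The identity check (v is None) is used instead of pd.isna precisely because pd.isna reports both None and np.nan as missing.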
Output of pd.show_versions()
INSTALLED VERSIONS
commit: d64995a
python: 3.6.2.final.0
python-bits: 64
OS: Darwin
OS-release: 17.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.22.0.dev0+205.gd64995a4f
pytest: 3.2.5
pip: 9.0.1
setuptools: 36.4.0
Cython: 0.26
numpy: 1.13.1
scipy: None
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.3
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None