BUG: ordered categorical comparison with missing values evaluates to True · Issue #26504 · pandas-dev/pandas
Code Sample, a copy-pastable example if possible
import pandas as pd
pd.Categorical(["1", "2", "3", None], categories=["1", "2", "3"], ordered=True) <= "2"
=> array([ True, True, False, True])
Problem description
Here the missing entry is evaluated as if None <= "2" were True. Shouldn't missing values always evaluate to False in any comparison?
I think this is related to #4537
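A possible explanation, sketched below under an assumption the report itself does not confirm: if the comparison is carried out on the integer codes of the categorical, the sentinel code -1 used for missing entries sorts below every real category. The variable names here are illustrative only.

```python
import pandas as pd

cat = pd.Categorical(["1", "2", "3", None], categories=["1", "2", "3"], ordered=True)

# Missing entries are stored with the sentinel code -1.
codes = cat.codes

# Position of "2" within the ordered categories.
threshold = cat.categories.get_loc("2")

# Assumption: a comparison done purely on codes treats the -1 sentinel
# as smaller than every category, so the missing entry compares as True.
naive = codes <= threshold
print(naive.tolist())
```

This reproduces the reported pattern from the codes alone, independent of how a given pandas version implements the comparison operator.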
Expected Output
import pandas as pd
pd.Categorical(["1", "2", "3", None], categories=["1", "2", "3"], ordered=True) <= "2"
=> array([ True, True, False, False])
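Until the behaviour changes, one way to get the expected output is to mask the missing positions explicitly. This is only a workaround sketch; `le_skip_missing` is a hypothetical helper name, not part of pandas.

```python
import numpy as np
import pandas as pd


def le_skip_missing(cat, value):
    """Hypothetical helper: `cat <= value`, forced to False where `cat` is missing."""
    # Copy the comparison result so the in-place masking never touches shared data.
    result = np.array(cat <= value, dtype=bool)
    # Categorical.isna() marks the missing positions.
    result[cat.isna()] = False
    return result


cat = pd.Categorical(["1", "2", "3", None], categories=["1", "2", "3"], ordered=True)
print(le_skip_missing(cat, "2").tolist())
```

The mask is applied after the comparison, so the helper gives the same answer whether or not the underlying comparison already returns False for missing entries.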
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.8.final.0
python-bits: 64
OS: Darwin
OS-release: 17.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
pandas: 0.24.2
pytest: 3.6.2
pip: 10.0.1
setuptools: 39.2.0
Cython: 0.28.3
numpy: 1.16.3
scipy: 1.2.1
pyarrow: None
xarray: None
IPython: 7.1.1
sphinx: 1.7.5
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.9
feather: None
matplotlib: 3.0.3
openpyxl: 2.5.4
xlrd: 1.1.0
xlwt: 1.2.0
xlsxwriter: 1.0.5
lxml.etree: 4.2.2
bs4: 4.6.0
html5lib: 0.9999999
sqlalchemy: 1.2.8
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None