BUG: Qcut interval not selecting the correct inclusive and exclusive limits · Issue #59355 · pandas-dev/pandas

### Pandas version checks

### Reproducible Example

import pandas as pd

pd.__version__
'2.2.2'

cat = pd.qcut([0, 1, 2, 3], 3, precision=0)

cat
[(-1.0, 1.0], (1.0, 2.0], (2.0, 3.0], (2.0, 3.0]]
Categories (3, interval[float64, right]): [(-1.0, 1.0] < (1.0, 2.0] < (2.0, 3.0]]

The expected output is:

cat
[(-1.0, 1.0], (-1.0, 1.0], (1.0, 2.0], (2.0, 3.0]]
Categories (3, interval[float64, right]): [(-1.0, 1.0] < (1.0, 2.0] < (2.0, 3.0]]


### Issue Description

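The second element, with value 1, should be assigned to the interval (-1.0, 1.0], which is closed on the right and therefore includes 1. Instead it is assigned to (1.0, 2.0], which excludes 1. Likewise, the third element, with value 2, is assigned to (2.0, 3.0] rather than the expected (1.0, 2.0].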
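
One way to investigate the mismatch is to ask `qcut` for the bin edges it computed, using `retbins=True`, and compare them with the rounded labels. The `precision` argument is documented as controlling how the bin labels are stored and displayed, so the labels and the underlying edges could in principle differ. The sketch below is a diagnostic only, not a confirmed explanation of the behaviour. The variable names are illustrative, and the printed edge values may depend on the pandas and NumPy versions installed.

```python
import pandas as pd

values = [0, 1, 2, 3]

# retbins=True additionally returns the bin edges derived from the quantiles.
cat, edges = pd.qcut(values, 3, precision=0, retbins=True)

# Print each edge at full precision so small floating-point deviations
# from the rounded labels (if any) become visible.
for edge in edges:
    print(repr(float(edge)))

# Integer code of the bin assigned to each input value.
codes = list(cat.codes)
print(codes)
```

If an inner edge prints as slightly less than 1.0 or 2.0, a value sitting exactly on that boundary would fall into the next bin even though the label shows the rounded edge.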

### Expected Behavior

The expected result is:

cat
[(-1.0, 1.0], (-1.0, 1.0], (1.0, 2.0], (2.0, 3.0]]
Categories (3, interval[float64, right]): [(-1.0, 1.0] < (1.0, 2.0] < (2.0, 3.0]]
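
For comparison, the assignment described above can be reproduced with `pd.cut` by passing the labelled edges explicitly. This is a minimal sketch, assuming the edges `[-1, 1, 2, 3]` read off the interval labels. It is not how `qcut` derives its bins, and because the edges are integers the resulting categories have an integer interval dtype rather than `float64`.

```python
import pandas as pd

values = [0, 1, 2, 3]

# Assumption: edges taken from the displayed labels (-1, 1], (1, 2], (2, 3].
explicit_edges = [-1, 1, 2, 3]

# pd.cut uses right-closed intervals by default, so 1 falls in (-1, 1]
# and 2 falls in (1, 2].
expected = pd.cut(values, bins=explicit_edges)

codes = expected.codes.tolist()
print(expected)
print(codes)  # [0, 0, 1, 2]
```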


### Installed Versions

<details>

INSTALLED VERSIONS
------------------
commit                : d9cdd2ee5a58015ef6f4d15c7226110c9aab8140
python                : 3.11.4.final.0
python-bits           : 64
OS                    : Darwin
OS-release            : 21.6.0
Version               : Darwin Kernel Version 21.6.0: Wed Apr 24 06:02:02 PDT 2024; root:xnu-8020.240.18.708.4~1/RELEASE_X86_64
machine               : x86_64
processor             : i386
byteorder             : little
LC_ALL                : None
LANG                  : None
LOCALE                : en_GB.UTF-8
pandas                : 2.2.2
numpy                 : 1.26.0
pytz                  : 2023.3.post1
dateutil              : 2.8.2
setuptools            : 68.2.0
pip                   : 23.2.1
Cython                : None
pytest                : None
hypothesis            : None
sphinx                : None
blosc                 : None
feather               : None
xlsxwriter            : None
lxml.etree            : None
html5lib              : None
pymysql               : None
psycopg2              : None
jinja2                : 3.1.2
IPython               : None
pandas_datareader     : None
adbc-driver-postgresql: None
adbc-driver-sqlite    : None
bs4                   : None
bottleneck            : None
dataframe-api-compat  : None
fastparquet           : None
fsspec                : 2023.9.2
gcsfs                 : None
matplotlib            : None
numba                 : None
numexpr               : None
odfpy                 : None
openpyxl              : None
pandas_gbq            : None
pyarrow               : None
pyreadstat            : None
python-calamine       : None
pyxlsb                : None
s3fs                  : None
scipy                 : 1.11.3
sqlalchemy            : None
tables                : None
tabulate              : None
xarray                : None
xlrd                  : None
zstandard             : None
tzdata                : 2023.3
qtpy                  : None
pyqt5                 : None
</details>