qcut raising TypeError for boolean Series · Issue #20303 · pandas-dev/pandas
Code Sample, a copy-pastable example if possible
import pandas as pd
pd.qcut(pd.Series([True, False, False, False, False, False, True]), 6, duplicates="drop", precision=2)
Problem description
Pandas throws a TypeError:
Traceback (most recent call last):
  File "/tmp/pandas/env/lib/python3.5/site-packages/numpy/core/fromnumeric.py", line 52, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
TypeError: Cannot cast ufunc multiply output from dtype('float64') to dtype('bool') with casting rule 'same_kind'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/pandas/env/src/pandas/pandas/core/reshape/tile.py", line 210, in qcut
    dtype=dtype, duplicates=duplicates)
  File "/tmp/pandas/env/src/pandas/pandas/core/reshape/tile.py", line 254, in _bins_to_cuts
    dtype=dtype)
  File "/tmp/pandas/env/src/pandas/pandas/core/reshape/tile.py", line 351, in _format_labels
    precision = _infer_precision(precision, bins)
  File "/tmp/pandas/env/src/pandas/pandas/core/reshape/tile.py", line 429, in _infer_precision
    levels = [_round_frac(b, precision) for b in bins]
  File "/tmp/pandas/env/src/pandas/pandas/core/reshape/tile.py", line 429, in <listcomp>
    levels = [_round_frac(b, precision) for b in bins]
  File "/tmp/pandas/env/src/pandas/pandas/core/reshape/tile.py", line 422, in _round_frac
    return np.around(x, digits)
  File "/tmp/pandas/env/lib/python3.5/site-packages/numpy/core/fromnumeric.py", line 2837, in around
    return _wrapfunc(a, 'round', decimals=decimals, out=out)
  File "/tmp/pandas/env/lib/python3.5/site-packages/numpy/core/fromnumeric.py", line 62, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
  File "/tmp/pandas/env/lib/python3.5/site-packages/numpy/core/fromnumeric.py", line 42, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
TypeError: Cannot cast ufunc multiply output from dtype('float64') to dtype('bool') with casting rule 'same_kind'
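The last pandas frame in the traceback is `np.around(x, digits)` inside `_round_frac`, so the failure can probably be isolated from pandas entirely. The snippet below is only a sketch of that idea: it assumes the bin edge reaching `np.around` is a NumPy boolean scalar and that `digits` is positive (I used 2, matching `precision=2`). Whether the boolean call raises may depend on the NumPy version, so it is wrapped in `try`/`except`.

```python
import numpy as np

# Assumption: a bin edge with boolean dtype, as the traceback suggests.
edge = np.bool_(True)

try:
    # Rounding a boolean to a positive number of decimals; on the NumPy
    # version from this report this is where the casting TypeError appears.
    print("rounded bool:", np.around(edge, 2))
except TypeError as exc:
    print("np.around on a boolean failed:", exc)

# Converting the edge to float first avoids the bool output dtype altogether.
rounded_float = np.around(float(edge), 2)
print("rounded float:", rounded_float)
```

If this holds, coercing boolean input to a numeric dtype before the bin edges are computed would sidestep the problem.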
If the second parameter for qcut is changed from 6 to 7, a different TypeError is raised:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/pandas/env/src/pandas/pandas/core/reshape/tile.py", line 207, in qcut
    bins = algos.quantile(x, quantiles)
  File "/tmp/pandas/env/src/pandas/pandas/core/algorithms.py", line 903, in quantile
    return algos.arrmap_float64(q, _get_score)
  File "pandas/_libs/algos_common_helper.pxi", line 416, in pandas._libs.algos.arrmap_float64
  File "/tmp/pandas/env/src/pandas/pandas/core/algorithms.py", line 888, in _get_score
    idx % 1)
  File "/tmp/pandas/env/src/pandas/pandas/core/algorithms.py", line 876, in _interpolate
    return a + (b - a) * fraction
TypeError: numpy boolean subtract, the `-` operator, is deprecated, use the bitwise_xor, the `^` operator, or the logical_xor function instead.
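This second error comes from the interpolation expression `a + (b - a) * fraction` being evaluated on boolean values. A minimal sketch outside pandas, assuming `a` and `b` arrive as NumPy boolean scalars; the `interpolate` helper here is my own stand-in that copies the expression from the traceback, not the pandas function itself:

```python
import numpy as np


def interpolate(a, b, fraction):
    """Linear interpolation using the same expression as in the traceback."""
    return a + (b - a) * fraction


low, high = np.bool_(False), np.bool_(True)

try:
    # `high - low` is a boolean subtraction, which NumPy rejects.
    print("bool inputs:", interpolate(low, high, 0.5))
except TypeError as exc:
    print("interpolating booleans failed:", exc)

# With float inputs the same expression is ordinary arithmetic.
value = interpolate(float(low), float(high), 0.5)
print("float inputs:", value)
```

So the quantile computation itself seems fine; it only needs numeric rather than boolean operands.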
Expected Output
Something like
0      (0.29, 1.0]
1    (-0.01, 0.29]
2    (-0.01, 0.29]
3    (-0.01, 0.29]
4    (-0.01, 0.29]
5    (-0.01, 0.29]
6      (0.29, 1.0]
dtype: category
Categories (2, interval[float64]): [(-0.01, 0.29] < (0.29, 1.0]]
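As a possible workaround until this is handled inside `qcut`, casting the boolean Series to integers before binning should avoid both errors. This is a sketch, not a tested fix across versions: the `astype(int)` cast is my assumption about what the caller would want, and the exact intervals returned depend on the pandas version and on how `duplicates="drop"` collapses repeated quantile edges, so they may not match the output above.

```python
import pandas as pd

flags = pd.Series([True, False, False, False, False, False, True])

# Assumed workaround: bin an integer view of the flags (True -> 1, False -> 0)
# so that quantile interpolation and label rounding operate on numbers.
binned = pd.qcut(flags.astype(int), 6, duplicates="drop", precision=2)

print(binned)
```

Note that with mostly identical values, many quantile edges coincide, so after dropping duplicates there may be fewer bins than requested.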
Output of pd.show_versions()
Using Pandas 0.23.0.dev0+516.g74e6c78, also reproducible with 0.22.0. In Pandas 0.20.3, the first TypeError is also reproducible, but the second command (with 7 instead of 6) works.
INSTALLED VERSIONS
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.13.0-36-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
LOCALE: de_DE.UTF-8
pandas: 0.23.0.dev0+516.g74e6c78
pytest: None
pip: 9.0.1
setuptools: 38.5.2
Cython: 0.27.3
numpy: 1.14.1
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.0
pytz: 2018.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None