ZeroDivisionError when groupby rank with method="dense" and pct=True · Issue #23666 · pandas-dev/pandas

When I call the groupby rank function with both method="dense" and pct=True, I get a ZeroDivisionError.

Code Sample, a copy-pastable example if possible

import pandas as pd

df = pd.DataFrame({"A": [1, 1, 1, 2, 2, 2], "B": [1, 1, 1, 1, 2, 2], "C": [1, 2, 1, 1, 1, 2]})
df.groupby(["A", "B"])["C"].rank(method="dense", pct=True)

error:

Traceback (most recent call last):
  File "c:/Users/<user_name>/Documents/test.py", line 6, in <module>
    df.groupby(["A", "B"])["C"].rank(method="dense", pct=True)
  File "C:\Users\<user_name>\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py", line 1906, in rank
    na_option=na_option, pct=pct, axis=axis)
  File "C:\Users\<user_name>\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py", line 1025, in _cython_transform
    **kwargs)
  File "C:\Users\<user_name>\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py", line 2630, in transform
    return self._cython_operation('transform', values, how, axis, **kwargs)
  File "C:\Users\<user_name>\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py", line 2590, in _cython_operation
    **kwargs)
  File "C:\Users\<user_name>\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py", line 2664, in _transform
    transform_func(result, values, comp_ids, is_datetimelike, **kwargs)
  File "C:\Users\<user_name>\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py", line 2479, in wrapper
    return f(afunc, *args, **kwargs)
  File "C:\Users\<user_name>\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py", line 2431, in <lambda>
    kwargs.get('na_option', 'keep')
  File "pandas\_libs\groupby_helper.pxi", line 1292, in pandas._libs.groupby.group_rank_int64
ZeroDivisionError: float division

Problem description

I encountered ZeroDivisionError when I tried to use the groupby rank function.

I can't work out exactly what the problem is, but when I drop either the method="dense" or the pct=True option, the above code works.
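Since the call works when pct=True is dropped, one possible workaround is to compute the dense rank first and then divide by the largest dense rank within each group. This is only a sketch: the names dense, group_max, and pct_dense are illustrative, and it assumes that the intended pct result for method="dense" is the dense rank divided by the number of distinct values in the group, which is consistent with the output shown for the modified DataFrame in this report.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1, 1, 2, 2, 2],
                   "B": [1, 1, 1, 1, 2, 2],
                   "C": [1, 2, 1, 1, 1, 2]})

# Dense rank within each (A, B) group, without pct=True.
dense = df.groupby(["A", "B"])["C"].rank(method="dense")

# Largest dense rank per group, broadcast back to the original rows.
# Assumption: this equals the number of distinct values in the group.
group_max = dense.groupby([df["A"], df["B"]]).transform("max")

# Hand-computed percentage rank.
pct_dense = dense / group_max
print(pct_dense)
```

This avoids the combined method="dense", pct=True code path entirely, at the cost of a second groupby pass.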

If some elements in the above DataFrame are changed, the error disappears. For example, the following code gives the expected output.

# a little change in column C
df = pd.DataFrame({"A": [1, 1, 1, 2, 2, 2], "B": [1, 1, 1, 1, 2, 2], "C": [1, 2, 1, 0, 1, 2]})
df.groupby(["A", "B"])["C"].rank(method="dense", pct=True)

output:

0    0.5
1    1.0
2    0.5
3    1.0
4    0.5
5    1.0
Name: C, dtype: float64

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.23.4
pytest: 3.5.1
pip: 10.0.1
setuptools: 39.1.0
Cython: 0.28.2
numpy: 1.14.5
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: 1.7.4
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.3
numexpr: 2.6.5
feather: None
matplotlib: 3.0.0
openpyxl: 2.5.3
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.4
lxml: 4.2.5
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.7
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
