Series/DataFrame.rank() doesn't handle certain floats properly · Issue #8365 · pandas-dev/pandas
Series.rank() appears to mishandle floats that are close together in pandas 0.14.1: in the example below, every value except the largest receives the same tied rank of 19.5. For reference, the same test worked in pandas 0.12.0.
Current Behavior
>>> import pandas as pd
>>> series = pd.Series([1000.000669 , 1000.000041 , 1000.000059 , 1000.000063 , 1000.000121 , 1000.000104 , 1000.000040 , 1000.000062 , 1000.000095 , 1000.000091 , 1000.000050 , 1000.000074 , 1000.000063 , 1000.000076 , 1000.000083 , 1000.000061 , 1000.000030 , 1000.000069 , 1000.000090 , 1000.000116 , 1000.000058 , 1000.000074 , 1000.000035 , 1000.000084 , 1000.000067 , 1000.000072 , 1000.000105 , 1000.000091 , 1000.000077 , 1000.000040 , 1000.000108 , 1000.000117 , 1000.000114 , 1000.000117 , 1000.000099 , 1000.000039 , 1000.000046 , 1000.000105 , 1000.000057])
>>> series.rank()
0 39.0
1 19.5
2 19.5
3 19.5
4 19.5
5 19.5
6 19.5
7 19.5
8 19.5
9 19.5
10 19.5
11 19.5
12 19.5
13 19.5
14 19.5
15 19.5
16 19.5
17 19.5
18 19.5
19 19.5
20 19.5
21 19.5
22 19.5
23 19.5
24 19.5
25 19.5
26 19.5
27 19.5
28 19.5
29 19.5
30 19.5
31 19.5
32 19.5
33 19.5
34 19.5
35 19.5
36 19.5
37 19.5
38 19.5
dtype: float64
Expected Behavior
>>> from scipy import stats
>>> stats.rankdata(series)
array([ 39. , 6. , 11. , 14.5, 38. , 30. , 4.5, 13. , 28. ,
26.5, 8. , 19.5, 14.5, 21. , 23. , 12. , 1. , 17. ,
25. , 35. , 10. , 19.5, 2. , 24. , 16. , 18. , 31.5,
26.5, 22. , 4.5, 33. , 36.5, 34. , 36.5, 29. , 3. ,
7. , 31.5, 9. ])
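For readers without SciPy at hand, the expected result can be approximated with a small dependency-free sketch of average ranking, the rule stats.rankdata applies by default: sort the values, and give each group of exactly equal values the mean of the positions it occupies. The names average_rank and sample are illustrative only and are not part of pandas or SciPy; the sketch makes no claim about how pandas implements rank() internally.

```python
def average_rank(values):
    """Return 1-based ranks; exactly equal values share the mean of their positions."""
    # Indices of the input, ordered by the value they point at.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the tie group only while the next value is exactly equal
        # (no tolerance), so nearby but distinct floats stay distinct.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Mean of the 1-based positions start+1 .. end+1.
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# A hypothetical subset of the reported values, including one genuine tie.
sample = [1000.000669, 1000.000041, 1000.000059,
          1000.000063, 1000.000063, 1000.000040]
print(average_rank(sample))  # [6.0, 2.0, 3.0, 4.5, 4.5, 1.0]
```

Only the two identical 1000.000063 entries share a rank here; values that differ by as little as 1e-06 are still ranked separately, which is the behavior the report expects from Series.rank().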
System Information
>>> pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Darwin
OS-release: 13.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.14.1
nose: 1.3.4
Cython: 0.21
numpy: 1.8.1
scipy: 0.14.0
statsmodels: 0.5.0
IPython: None
sphinx: None
patsy: 0.3.0
scikits.timeseries: None
dateutil: 2.2
pytz: 2014.7
bottleneck: 0.8.0
tables: 3.0.0
numexpr: 2.4
matplotlib: None
openpyxl: 2.1.0
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: 0.8.0
pymysql: None
psycopg2: 2.5.4 (dt dec pq3 ext)