CategoricalIndex reidex duplicates values · Issue #21809 · pandas-dev/pandas (original) (raw)
From SO question:
Sample:
np.random.seed(57)
idx = pd.CategoricalIndex(['low'] * 3 + ['hi'] * 3)
dfb = pd.DataFrame(np.random.rand(6, 3), columns=list('abc'), index=idx)
print (dfb)
a b c
low 0.087350 0.230477 0.411061
low 0.310783 0.565956 0.545064
low 0.807099 0.918155 0.522091
hi 0.424687 0.071804 0.898529
hi 0.420514 0.582170 0.214154
hi 0.447486 0.467864 0.100637
Round incorrectly explode rows:
print (dfb.round(3))
a b c
low 0.087 0.230 0.411
low 0.311 0.566 0.545
low 0.807 0.918 0.522
low 0.087 0.230 0.411
low 0.311 0.566 0.545
low 0.807 0.918 0.522
low 0.087 0.230 0.411
low 0.311 0.566 0.545
low 0.807 0.918 0.522
hi 0.425 0.072 0.899
hi 0.421 0.582 0.214
hi 0.447 0.468 0.101
hi 0.425 0.072 0.899
hi 0.421 0.582 0.214
hi 0.447 0.468 0.101
hi 0.425 0.072 0.899
hi 0.421 0.582 0.214
hi 0.447 0.468 0.101
Expected output:
print (dfb.round(3))
a b c
low 0.087 0.230 0.411
low 0.311 0.566 0.545
low 0.807 0.918 0.522
hi 0.425 0.072 0.899
hi 0.421 0.582 0.214
hi 0.447 0.468 0.101
print (pd.show_versions())
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 42 Stepping 7, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None
pandas: 0.23.1
pytest: 3.3.2
pip: 9.0.1
setuptools: 39.2.0
Cython: 0.27.3
numpy: 1.14.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.6
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.2.2
openpyxl: 2.4.10
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.1
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
None