Rethinking to_flat_index on flat Index · Issue #23670 · pandas-dev/pandas

This follows up on #22866, and on a comment I made there ( #22866 (comment) ) which I thought was paranoid at the time, and maybe is not.

Three considerations:

  1. to_flat_index is the only method (in fact, the only operation in pandas) I can think of which gives a different result when called on a flat Index than when called on an equivalent 1-level MultiIndex: you get the Index itself (each item presumably being a scalar) in the first case, and an Index of length-1 tuples in the second.
  2. If one needs, for any reason, an Index of length-1 tuples, there is no simple way to obtain it from a flat Index. Or, if you prefer: if we add a sep or analogous argument (e.g. fmt) to to_flat_index, it is not going to work on a flat Index (at least with the current implementation). More generally, to_flat_index is a method which makes sense partly because it can be combined with, for instance, .map. When doing so, it is good to know for sure that the arguments received by the callable are always tuples.
  3. Index.to_flat_index() as it is now is idempotent and pretty useless: it only makes sense for compatibility with MultiIndex... but then compatibility should be the priority!
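To make point 2 concrete, here is a minimal sketch of what can go wrong when a callable written for tuples is passed to .map after to_flat_index. The variable names and the string labels are made up for illustration; only Index.to_flat_index, MultiIndex.from_arrays and Index.map are real pandas calls.

```python
import pandas as pd

# Same labels, once as a flat Index and once as a 1-level MultiIndex.
flat = pd.Index(["ab", "cd"])
one_level = pd.MultiIndex.from_arrays([["ab", "cd"]])

# A callable that expects a tuple of strings, as a `sep` argument would.
joiner = "_".join

# On the flat Index the callable receives the scalar "ab" and silently
# iterates over its characters, giving "a_b" instead of "ab".
flat_labels = list(flat.to_flat_index().map(joiner))

# On the 1-level MultiIndex the callable receives the tuple ("ab",).
one_level_labels = list(one_level.to_flat_index().map(joiner))

print(flat_labels)       # ['a_b', 'c_d']
print(one_level_labels)  # ['ab', 'cd']
```

With non-string scalars (e.g. integers) the same callable would raise a TypeError instead of silently producing different labels.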

All of this leads me to ask: shouldn't pd.Index.to_flat_index() return an Index of length-1 tuples?
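As a rough illustration of the proposed behaviour, the sketch below wraps to_flat_index in a helper that always returns tuples. The helper name to_tuple_index is hypothetical and not part of pandas; it is only one possible workaround, not a proposed implementation.

```python
import pandas as pd


def to_tuple_index(idx):
    """Hypothetical helper: return an Index whose items are always tuples."""
    if isinstance(idx, pd.MultiIndex):
        # MultiIndex.to_flat_index() already yields one tuple per row.
        return idx.to_flat_index()
    # Wrap each scalar in a length-1 tuple; tupleize_cols=False stops
    # pandas from turning the list of tuples back into a MultiIndex.
    return pd.Index([(item,) for item in idx], tupleize_cols=False)


flat_result = to_tuple_index(pd.Index([1, 2, 3]))
multi_result = to_tuple_index(pd.MultiIndex.from_arrays([pd.Index([1, 2, 3])]))

# Both inputs now give the same kind of result.
print(list(flat_result))
print(list(multi_result))
```

With such a helper, a callable passed to .map can rely on receiving tuples whether the input was a flat Index or a MultiIndex.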

@WillAyd

Code Sample, a copy-pastable example if possible

In [2]: pd.Index([1, 2, 3,]).to_flat_index()
Out[2]: Int64Index([1, 2, 3], dtype='int64')

In [3]: pd.MultiIndex.from_arrays([pd.Index([1, 2, 3,])]).to_flat_index()
Out[3]: Index([(1,), (2,), (3,)], dtype='object')

Expected Output

Out[2] should be equal to Out[3]

Output of pd.show_versions()

INSTALLED VERSIONS

commit: 454ecfc
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-8-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.24.0.dev0+995.g454ecfc61
pytest: 3.5.0
pip: 9.0.1
setuptools: 39.2.0
Cython: 0.28.4
numpy: 1.14.3
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.6
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.0dev
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.2.2.post1634.dev0+ge8120cf6d
openpyxl: 2.3.0
xlrd: 1.0.0
xlwt: 1.3.0
xlsxwriter: 0.9.6
lxml: 4.1.1
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1
gcsfs: None