BUG: Index constructor should not allow an ndarray with ndim > 2

Code Sample, a copy-pastable example if possible

On master:

In [1]: import numpy as np; import pandas as pd; pd.__version__
Out[1]: '0.25.0.dev0+833.gad18ea35b'

In [2]: pd.Index(np.arange(8).reshape(2, 2, 2))
Out[2]: Int64Index([[[0, 1], [2, 3]], [[4, 5], [6, 7]]], dtype='int64')

If the first dimension is greater than 2, the repr looks flattened, but the underlying values are not:

In [3]: pd.Index(np.arange(12).reshape(3, 2, 2))
Out[3]: Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], dtype='int64')

In [4]: _.values
Out[4]:
array([[[ 0,  1],
        [ 2,  3]],

       [[ 4,  5],
        [ 6,  7]],

       [[ 8,  9],
        [10, 11]]])

Problem description

The Index constructor accepts ndarrays with ndim > 2 and will even convert them to specialized subclasses, e.g. Int64Index.

Expected Output

I'd expect the operations above to raise, or at the very least to result in an object-dtype Index, though I'd prefer that they raise.
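
To illustrate the raising behavior, here is a minimal sketch of a dimensionality guard. It is not pandas code: `check_index_ndim` is a hypothetical helper, and the `max_ndim=1` cutoff is an assumption. The 3-D inputs from the examples above are rejected under either a 1-D or a 2-D cutoff.

```python
import numpy as np


def check_index_ndim(data, max_ndim=1):
    """Hypothetical guard: reject array-likes with too many dimensions.

    max_ndim=1 is an assumed cutoff for illustration, not taken from pandas.
    """
    arr = np.asarray(data)
    if arr.ndim > max_ndim:
        raise ValueError(
            f"Index data must have at most {max_ndim} dimension(s), "
            f"got ndim={arr.ndim}"
        )
    return arr


# The two 3-D inputs from the report are rejected.
for data in (np.arange(8).reshape(2, 2, 2), np.arange(12).reshape(3, 2, 2)):
    try:
        check_index_ndim(data)
    except ValueError as exc:
        print(f"shape {data.shape}: {exc}")

# A 1-D input passes through unchanged and could then be handed to pd.Index.
flat = check_index_ndim(np.arange(8))
print(flat.shape)
```

A guard like this would need to run before any dtype-based dispatch to a specialized subclass, so that a multi-dimensional array never reaches e.g. Int64Index.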

xref #17246

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit : ad18ea3
python : 3.7.3.final.0
python-bits : 64
OS : Linux
OS-release : 4.19.14-041914-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 0.25.0.dev0+833.gad18ea35b
numpy : 1.16.4
pytz : 2019.1
dateutil : 2.8.0
pip : 19.1.1
setuptools : 40.8.0
Cython : 0.29.10
pytest : 4.6.2
hypothesis : 4.23.6
sphinx : 1.8.5
blosc : None
feather : None
xlsxwriter : 1.1.8
lxml.etree : 4.3.3
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.1
IPython : 7.5.0
pandas_datareader: None
bs4 : 4.7.1
bottleneck : 1.2.1
fastparquet : 0.3.0
gcsfs : None
lxml.etree : 4.3.3
matplotlib : 3.1.0
numexpr : 2.6.9
openpyxl : 2.6.2
pandas_gbq : None
pyarrow : 0.11.1
pytables : None
s3fs : 0.2.1
scipy : 1.2.1
sqlalchemy : 1.3.4
tables : 3.5.2
xarray : 0.12.1
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.1.8
