API/ERR/ENH: Allow MultiIndex.from_tuples to handle NaNs · Issue #23578 · pandas-dev/pandas
This is the origin of #23558 and #23677, but IMO worth fixing in its own right:
Trying to create a MultiIndex from a list of tuples that contains bare NaNs (as Index.str.partition produces if there are NaNs present) yields:
>>> pd.MultiIndex.from_tuples([('a', 'b', 'c'), np.nan, ('d', '', '')])
[...]
TypeError: object of type 'float' has no len()
However, it works fine when passing a tuple of NaNs instead:
>>> pd.MultiIndex.from_tuples([('a', 'b', 'c'), (np.nan,) * 3, ('d', '', '')])
MultiIndex(levels=[['a', 'd'], ['', 'b'], ['', 'c']],
labels=[[0, -1, 1], [1, -1, 0], [1, -1, 0]])
In fact, the length of the tuple is irrelevant:
>>> pd.MultiIndex.from_tuples([('a', 'b', 'c'), (np.nan,), ('d', '', '')])
MultiIndex(levels=[['a', 'd'], ['', 'b'], ['', 'c']],
labels=[[0, -1, 1], [1, -1, 0], [1, -1, 0]])
Aside from the inconvenience, replacing the NaN elements of an array with tuples (say, if there are several NaNs) is a real problem in itself, because a tuple on the right-hand side almost always gets interpreted as another axis/list-like and then gives creative errors.
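Until that is handled in from_tuples itself, one possible workaround is to normalize the input with a plain list comprehension, which avoids assigning tuples into an array altogether. This is only a sketch: the helper name pad_nan_rows is hypothetical (not part of pandas), and it assumes every non-NaN entry is already a tuple of the same length. It reads the result through the codes attribute, which I believe is the newer name for what the output above shows as labels.

```python
import numpy as np
import pandas as pd


def pad_nan_rows(rows):
    """Replace bare float NaN entries with full-width tuples of NaN.

    Hypothetical helper for illustration; assumes all non-NaN entries
    are tuples of equal length.
    """
    # Width is taken from the tuples that are present.
    width = max(len(row) for row in rows if isinstance(row, tuple))
    return [
        (np.nan,) * width if isinstance(row, float) and np.isnan(row) else row
        for row in rows
    ]


rows = [('a', 'b', 'c'), np.nan, ('d', '', '')]
mi = pd.MultiIndex.from_tuples(pad_nan_rows(rows))

# The NaN row should be marked as missing (-1) in every level.
print([int(level_codes[1]) for level_codes in mi.codes])
```

Because the replacement happens in a Python list rather than in an ndarray or Series, the tuples are never broadcast or unpacked as an extra axis.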
I think this extra fail-safe should not be controversial. I've got a fix prepared already, but it is blocked by #23582 (at least in terms of adding tests).