repr of MultiIndexed frame with NaT as the first element of level 0 of the index raises during repr · Issue #7406 · pandas-dev/pandas

World's longest issue title, for a very strange bug.

In [145]: idx = pd.to_datetime([pd.NaT] + pd.date_range('20130101', periods=2).tolist())

In [146]: idx
Out[146]:
<class 'pandas.tseries.index.DatetimeIndex'>
[NaT, ..., 2013-01-02]
Length: 3, Freq: None, Timezone: None

In [149]: df = DataFrame({'X': range(len(idx))}, index=[idx, tm.choice(list('ab'), size=len(idx))])

In [150]: df
Out[150]: <repr(<pandas.core.frame.DataFrame at 0x7ff147824950>) failed: ValueError: boolean index array has too many values>

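Building the same index from explicit Timestamps still fails (`In [184]`-`In [186]`), but strangely enough, if I construct it with slightly different dates (`In [187]`-`In [189]`) it works fine: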

In [184]: idx = pd.to_datetime([pd.NaT, pd.Timestamp('2013-01-01'), pd.Timestamp('2013-01-02')])

In [185]: df = DataFrame({'X': range(len(idx))}, index=[idx, tm.choice(list('ab'), size=len(idx))])

In [186]: df
Out[186]: <repr(<pandas.core.frame.DataFrame at 0x7ff14792d1d0>) failed: ValueError: boolean index array has too many values>

In [187]: idx = pd.to_datetime([pd.NaT, pd.Timestamp('2013-01-03'), pd.Timestamp('2013-01-02')])

In [188]: df = DataFrame({'X': range(len(idx))}, index=[idx, tm.choice(list('ab'), size=len(idx))])

In [189]: df
Out[189]:
              X
NaN        a  0
2013-01-03 b  1
2013-01-02 a  2
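
For anyone who wants to try this outside the session above, here is a minimal self-contained sketch of the same reproduction. It makes two assumptions that are not in the original session: `np.random.choice` stands in for `tm.choice`, and the `repr` call is wrapped in `try`/`except` so the script reports the error instead of stopping. Whether the `ValueError` appears will depend on the pandas version installed.

```python
import numpy as np
import pandas as pd

# Level 0: NaT first, followed by the two dates from the failing case.
dates = pd.to_datetime([pd.NaT, "2013-01-01", "2013-01-02"])

# Level 1: random 'a'/'b' labels (np.random.choice is a stand-in for tm.choice).
letters = np.random.choice(list("ab"), size=len(dates))

# Passing a list of arrays as the index builds a two-level MultiIndex.
df = pd.DataFrame({"X": range(len(dates))}, index=[dates, letters])

try:
    text = repr(df)
except ValueError as exc:
    # On affected versions this is where the failure from the report surfaces.
    text = "repr failed: ValueError: %s" % exc

print(text)
```

Swapping the two dates to `"2013-01-03", "2013-01-02"` gives the variant that rendered correctly in the report.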