repr of MultiIndexed frame with NaT as the first element of level 0 of the index raises during repr · Issue #7406 · pandas-dev/pandas
World's longest issue title, for a very strange bug.
In [145]: idx = pd.to_datetime([pd.NaT] + pd.date_range('20130101', periods=2).tolist())
In [146]: idx
Out[146]:
<class 'pandas.tseries.index.DatetimeIndex'>
[NaT, ..., 2013-01-02]
Length: 3, Freq: None, Timezone: None
In [149]: df = DataFrame({'X': range(len(idx))}, index=[idx, tm.choice(list('ab'), size=len(idx))])
In [150]: df
Out[150]: <repr(<pandas.core.frame.DataFrame at 0x7ff147824950>) failed: ValueError: boolean index array has too many values>
Strangely enough, constructing the index from explicit Timestamps for the same dates still fails, but slightly different dates work fine:
In [184]: idx = pd.to_datetime([pd.NaT, pd.Timestamp('2013-01-01'), pd.Timestamp('2013-01-02')])
In [185]: df = DataFrame({'X': range(len(idx))}, index=[idx, tm.choice(list('ab'), size=len(idx))])
In [186]: df
Out[186]: <repr(<pandas.core.frame.DataFrame at 0x7ff14792d1d0>) failed: ValueError: boolean index array has too many values>
In [187]: idx = pd.to_datetime([pd.NaT, pd.Timestamp('2013-01-03'), pd.Timestamp('2013-01-02')])
In [188]: df = DataFrame({'X': range(len(idx))}, index=[idx, tm.choice(list('ab'), size=len(idx))])
In [189]: df
Out[189]:
              X
NaN        a  0
2013-01-03 b  1
2013-01-02 a  2
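The session above can be condensed into a standalone script for checking a given pandas version. This is a minimal sketch, not the project's own test. The helper names `build_frame` and `safe_repr` are hypothetical. Fixed labels stand in for the random `tm.choice(...)` call, on the assumption that the labels do not matter to the bug. The index is built with `pd.MultiIndex.from_arrays` rather than by passing a list of arrays to `DataFrame`. Whether the first case fails depends on the installed pandas version.

```python
import pandas as pd


def build_frame(dates, labels):
    """Build a two-level MultiIndex frame with `dates` as level 0 (hypothetical helper)."""
    index = pd.MultiIndex.from_arrays([pd.to_datetime(dates), labels])
    return pd.DataFrame({"X": range(len(dates))}, index=index)


def safe_repr(obj):
    """Return (True, repr text), or (False, error message) if repr raises."""
    try:
        return True, repr(obj)
    except Exception as exc:  # report any failure instead of propagating it
        return False, f"{type(exc).__name__}: {exc}"


# Assumption: fixed labels replace the random tm.choice(list('ab'), ...) call.
labels = ["a", "b", "a"]

# NaT followed by 2013-01-01 and 2013-01-02: the dates reported as failing.
reported_failing = [pd.NaT] + pd.date_range("20130101", periods=2).tolist()
# NaT followed by 2013-01-03 and 2013-01-02: the dates reported as working.
reported_working = [pd.NaT, pd.Timestamp("2013-01-03"), pd.Timestamp("2013-01-02")]

cases = [("reported failing", reported_failing), ("reported working", reported_working)]
for name, dates in cases:
    ok, text = safe_repr(build_frame(dates, labels))
    print(name, "->", "repr ok" if ok else "repr FAILED")
    print(text)
```

If both cases print "repr ok", the installed version no longer shows the behaviour described here.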