BUG: plotting with DatetimeIndex containing NaT · Issue #12405 · pandas-dev/pandas
In [1]: %matplotlib
Using matplotlib backend: Qt4Agg
In [2]: df = pd.DataFrame({'date': pd.date_range('2016-01-01', periods=5), 'vals': range(5)})
In [3]: df.loc[2, 'date'] = np.nan
In [4]: s = df.set_index('date')['vals']
In [5]: s
Out[5]:
date
2016-01-01 0
2016-01-02 1
NaT 2
2016-01-04 3
2016-01-05 4
Name: vals, dtype: int64
In [6]: ax = s.plot()
In [11]: ax.get_lines()[0].get_data()
Out[11]:
(array([datetime.datetime(2016, 1, 1, 0, 0),
datetime.datetime(2016, 1, 2, 0, 0),
datetime.datetime(2016, 1, 4, 0, 0),
datetime.datetime(2016, 1, 5, 0, 0), NaT], dtype=object),
array([0, 1, 3, 4, 2], dtype=int64))
In [44]: ax.get_lines()[0].get_xydata()
Out[44]:
array([[ 7.35964000e+05, 0.00000000e+00],
[ 7.35965000e+05, 1.00000000e+00],
[ 7.35967000e+05, 3.00000000e+00],
[ 7.35968000e+05, 4.00000000e+00],
[ 6.12411009e+05, 2.00000000e+00]])
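So this gives you a plot in which one of the values is placed in the year 1677 (September 22), i.e. at the minimum possible Timestamp. The line data above also show that the NaT entry, together with its value 2, has been moved to the end.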
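One plausible explanation for the 1677 date, not verified against the plotting code, is that NaT is stored internally as the smallest int64 value, and that this integer ends up being read as nanoseconds since the epoch. A minimal sketch of that arithmetic (the variable names are only illustrative):

```python
from datetime import datetime, timedelta

import numpy as np
import pandas as pd

# NaT's underlying integer is the smallest representable int64.
int64_min = np.iinfo(np.int64).min
nat_is_int64_min = pd.NaT.value == int64_min

# Assumption: the sentinel is interpreted as nanoseconds since 1970-01-01.
# datetime only has microsecond resolution, so the sub-microsecond part
# is dropped by the floor division.
approx = datetime(1970, 1, 1) + timedelta(microseconds=int(int64_min) // 1000)

print(nat_is_int64_min)
print(approx.year, approx.month)
```

If that assumption holds, the resulting date falls in September 1677, which is consistent with the x value seen in the `get_xydata()` output above.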
There is a related issue about plt.plot(pd.NaT) erroring (#9253), but in this case it is pandas code that gives the wrong result, so we have more control over it. I also find it strange that this does not raise an error as in the pd.NaT case, but instead converts the NaT to Timestamp.min (although I have not looked into it in detail).
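Until this is fixed, a possible workaround is to drop the entries with a NaT index before plotting. This is only a sketch of what a user could do, not a proposal for how pandas should handle it internally; the name `clean` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'date': pd.date_range('2016-01-01', periods=5),
                   'vals': range(5)})
df.loc[2, 'date'] = pd.NaT
s = df.set_index('date')['vals']

# Keep only the rows whose index is a real timestamp.
clean = s[s.index.notna()]

print(clean)
```

Calling `clean.plot()` should then only see valid dates, at the cost of silently losing the point whose date is missing.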