pandas.DataFrame.asof — pandas 0.24.0rc1 documentation

DataFrame.asof(where, subset=None)

Return the last row(s) without any NaNs before where.

The last row without any NaN is taken (one row for each element of where, if where is a list). For a DataFrame, only the columns in subset (if not None) are considered when checking for NaN.

New in version 0.19.0: For DataFrame

If there is no good value, NaN is returned for a Series, or a Series of NaN values for a DataFrame.
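The "no good value" case can be illustrated with a lookup that falls before the first index label. This is a minimal sketch; the frame `df` and the variable names are made up for illustration and are not part of the original page.

```python
import numpy as np
import pandas as pd

# Series case: 5 is before the first index label (10), so there is no good value.
s = pd.Series([1, 2, np.nan, 4], index=[10, 20, 30, 40])
before_first = s.asof(5)          # a NaN scalar

# DataFrame case (hypothetical data): the result is a Series of NaN,
# with one entry per column.
df = pd.DataFrame({'a': [1.0, 2.0], 'b': [3.0, 4.0]}, index=[10, 20])
row_before_first = df.asof(5)
```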

Parameters:
    where : date or array-like of dates
        Date(s) before which the last row(s) are returned.
    subset : str or array-like of str, default None
        For DataFrame, if not None, only use these columns to check for NaNs.

Returns: scalar, Series, or DataFrame
    scalar : when self is a Series and where is a scalar
    Series : when self is a Series and where is an array-like, or when self is a DataFrame and where is a scalar
    DataFrame : when self is a DataFrame and where is an array-like
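The four combinations of caller and where can be checked directly. The sketch below uses small integer-indexed objects invented for illustration (the names `s`, `df` and the result variables are assumptions, not part of the original page).

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], index=[10, 20, 30])
df = pd.DataFrame({'a': [1.0, 2.0, 3.0],
                   'b': [10.0, 20.0, 30.0]}, index=[10, 20, 30])

scalar_result = s.asof(25)         # Series + scalar where     -> scalar
series_result = s.asof([15, 25])   # Series + array-like where -> Series
row_result = df.asof(25)           # DataFrame + scalar where  -> Series (one row)
frame_result = df.asof([15, 25])   # DataFrame + array-like    -> DataFrame
```

In the array-like cases the result is indexed by the values passed as where, not by the original index labels.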

See also

merge_asof

Perform an asof merge. Similar to left join.

Notes

Dates are assumed to be sorted. Raises if this is not the case.
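A short sketch of the sortedness requirement, using a deliberately unsorted integer index made up for illustration. In the pandas releases checked while writing this, the error is a ValueError; the exact exception type and message are an assumption here, since the page itself only says that the call raises.

```python
import pandas as pd

# Hypothetical Series whose index is not sorted.
unsorted = pd.Series([1.0, 2.0, 3.0], index=[30, 10, 20])

message = None
try:
    unsorted.asof(20)
except ValueError as err:
    # Assumed exception type; the documentation only states that it raises.
    message = str(err)

# Sorting the index first makes the lookup valid.
value = unsorted.sort_index().asof(25)
```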

Examples

A Series and a scalar where.

>>> s = pd.Series([1, 2, np.nan, 4], index=[10, 20, 30, 40])
>>> s
10    1.0
20    2.0
30    NaN
40    4.0
dtype: float64
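The scalar lookup itself does not appear in this copy of the page. A minimal sketch of such a call, rebuilding the same Series so the snippet runs on its own (the choice of 20 as the lookup value is an assumption):

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, np.nan, 4], index=[10, 20, 30, 40])

# A scalar where on a Series returns a scalar: the last non-NaN value
# at or before index label 20.
result = s.asof(20)   # 2.0
```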

For a sequence where, a Series is returned. The first value is NaN, because the first element of where is before the first index value.

>>> s.asof([5, 20])
5     NaN
20    2.0
dtype: float64

Missing values are not considered. The following is 2.0, not NaN, even though NaN is at the index location for 30.
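The call that the sentence above describes is missing from this copy of the page; it is presumably a lookup at index label 30. A sketch under that assumption:

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, np.nan, 4], index=[10, 20, 30, 40])

# The value stored at label 30 is NaN, so asof steps back to the
# previous non-NaN value, which sits at label 20.
result = s.asof(30)   # 2.0
```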

Take all columns into consideration

>>> df = pd.DataFrame({'a': [10, 20, 30, 40, 50],
...                    'b': [None, None, None, None, 500]},
...                   index=pd.DatetimeIndex(['2018-02-27 09:01:00',
...                                           '2018-02-27 09:02:00',
...                                           '2018-02-27 09:03:00',
...                                           '2018-02-27 09:04:00',
...                                           '2018-02-27 09:05:00']))
>>> df.asof(pd.DatetimeIndex(['2018-02-27 09:03:30',
...                           '2018-02-27 09:04:30']))
                      a   b
2018-02-27 09:03:30 NaN NaN
2018-02-27 09:04:30 NaN NaN

Take a single column into consideration

>>> df.asof(pd.DatetimeIndex(['2018-02-27 09:03:30',
...                           '2018-02-27 09:04:30']),
...         subset=['a'])
                        a   b
2018-02-27 09:03:30  30.0 NaN
2018-02-27 09:04:30  40.0 NaN