pandas.DataFrame.diff — pandas 3.0.0rc0+33.g1fd184de2a documentation (original) (raw)
DataFrame.diff(periods=1, axis=0)[source]#
First discrete difference of element.
Calculates the difference of a DataFrame element compared with another element in the DataFrame (default is element in previous row).
Parameters:
periodsint, default 1
Periods to shift for calculating difference, accepts negative values.
axis{0 or ‘index’, 1 or ‘columns’}, default 0
Take difference over rows (0) or columns (1).
Returns:
DataFrame
First differences of the Series.
Notes
For boolean dtypes, this uses operator.xor() rather thanoperator.sub(). The result is calculated according to current dtype in DataFrame, however dtype of the result is always float64.
Examples
Difference with previous row
df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6], ... 'b': [1, 1, 2, 3, 5, 8], ... 'c': [1, 4, 9, 16, 25, 36]}) df a b c 0 1 1 1 1 2 1 4 2 3 2 9 3 4 3 16 4 5 5 25 5 6 8 36
df.diff() a b c 0 NaN NaN NaN 1 1.0 0.0 3.0 2 1.0 1.0 5.0 3 1.0 1.0 7.0 4 1.0 2.0 9.0 5 1.0 3.0 11.0
Difference with previous column
df.diff(axis=1) a b c 0 NaN 0 0 1 NaN -1 3 2 NaN -1 7 3 NaN -1 13 4 NaN 0 20 5 NaN 2 28
Difference with 3rd previous row
df.diff(periods=3) a b c 0 NaN NaN NaN 1 NaN NaN NaN 2 NaN NaN NaN 3 3.0 2.0 15.0 4 3.0 4.0 21.0 5 3.0 6.0 27.0
Difference with following row
df.diff(periods=-1) a b c 0 -1.0 0.0 -3.0 1 -1.0 -1.0 -5.0 2 -1.0 -1.0 -7.0 3 -1.0 -2.0 -9.0 4 -1.0 -3.0 -11.0 5 NaN NaN NaN
Overflow in input dtype
df = pd.DataFrame({'a': [1, 0]}, dtype=np.uint8) df.diff() a 0 NaN 1 255.0