pandas.DataFrame.diff — pandas 3.0.0.dev0+2102.g839747c3f6 documentation (original) (raw)

DataFrame.diff(periods=1, axis=0)[source]#

First discrete difference of element.

Calculates the difference of a DataFrame element compared with another element in the DataFrame (default is element in previous row).

Parameters:

periodsint, default 1

Periods to shift for calculating difference, accepts negative values.

axis{0 or ‘index’, 1 or ‘columns’}, default 0

Take difference over rows (0) or columns (1).

Returns:

DataFrame

First differences of the Series.

Notes

For boolean dtypes, this uses operator.xor() rather thanoperator.sub(). The result is calculated according to current dtype in DataFrame, however dtype of the result is always float64.

Examples

Difference with previous row

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6], ... 'b': [1, 1, 2, 3, 5, 8], ... 'c': [1, 4, 9, 16, 25, 36]}) df a b c 0 1 1 1 1 2 1 4 2 3 2 9 3 4 3 16 4 5 5 25 5 6 8 36

df.diff() a b c 0 NaN NaN NaN 1 1.0 0.0 3.0 2 1.0 1.0 5.0 3 1.0 1.0 7.0 4 1.0 2.0 9.0 5 1.0 3.0 11.0

Difference with previous column

df.diff(axis=1) a b c 0 NaN 0 0 1 NaN -1 3 2 NaN -1 7 3 NaN -1 13 4 NaN 0 20 5 NaN 2 28

Difference with 3rd previous row

df.diff(periods=3) a b c 0 NaN NaN NaN 1 NaN NaN NaN 2 NaN NaN NaN 3 3.0 2.0 15.0 4 3.0 4.0 21.0 5 3.0 6.0 27.0

Difference with following row

df.diff(periods=-1) a b c 0 -1.0 0.0 -3.0 1 -1.0 -1.0 -5.0 2 -1.0 -1.0 -7.0 3 -1.0 -2.0 -9.0 4 -1.0 -3.0 -11.0 5 NaN NaN NaN

Overflow in input dtype

df = pd.DataFrame({'a': [1, 0]}, dtype=np.uint8) df.diff() a 0 NaN 1 255.0